[Pharo-users] Re: Standalone html builder (a la seaside without seaside?)

2020-09-30 Thread monty

Yes, XMLWriter can do this. You can also use:

 

xmlWriter outputsSelfClosingTags: false

 

before writing to get more HTML-like output
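
A minimal sketch of how this could look in a Playground. The element names and text are made up for illustration, and it assumes the XMLWriter package is loaded and that #tag:with: and #contents behave as they do in recent versions:

```smalltalk
| writer html |
writer := XMLWriter new.
"Ask for explicit end tags instead of self-closing ones, as suggested above."
writer outputsSelfClosingTags: false.
"Nested elements are written by nesting blocks; the names are illustrative."
writer tag: 'html' with: [
    writer tag: 'body' with: [
        writer tag: 'p' with: 'Hello from XMLWriter' ] ].
"Collect everything written so far as a String."
html := writer contents.
```

The resulting string can then be served from a Zinc handler or written to a file like any other String.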

___
montyos.wordpress.com

 
 

Sent: Tuesday, September 29, 2020 at 3:08 PM
From: "Santiago Bragagnolo" 
To: "Any question about pharo is welcome" 
Subject: [Pharo-users] Re: Standalone html builder (a la seaside without seaside?)



Hi Tim.

My two cents: I am using XML Writer (from the catalog) and I am really happy so far.

 


On Mon, 28 Sept 2020 at 19:28, Tim Mackinnon wrote:

Hi - has anyone ever managed to extract the html builder out of seaside - or written something equivalent?

I often find I want to build some HTML, but don’t want the full seaside - and was wondering if anyone has managed to extract it, or have something similar?

This combined with Renoir from BA-ST would give a good little light weight web potential to run with Zinc.

Tim

Re: [Pharo-users] Naming parameters - conventions?

2018-07-12 Thread monty

The primary trade-off is between Type Suggesting vs. Role Suggesting parameter 
names.

For example, aString and aSymbol tell us only what type of object the parameter 
expects, while aName and aTitle tell us the *role* (or purpose) of the 
parameter, with the expected type hopefully obvious.

You can combine the two, like aNameString, or aTitleSymbol, and this should be 
done where necessary to prevent confusion. For example, in XMLParser I have a 
URI class, XMLURI, and to avoid confusion over parameters that accept both 
XMLURIs and URI strings, I use anXMLURIOrURIString.

Using "a" and "an" prefixes is always a good idea, because it prevents 
collisions with instance variables, allows you to tell at a glance that an 
identifier is a parameter and not an inst var, and it produces more natural, 
readable message signatures, like "copyFrom: anOldPath to: aNewPath".

> Sent: Wednesday, July 11, 2018 at 6:24 PM
> From: "Tim Mackinnon" 
> To: "Pharo Users Newsgroup" 
> Subject: [Pharo-users] Naming parameters - conventions?
>
> Hi everyone, something I’ve meant to ask over the years, as I’ve seen lots of 
> variation and was taught something else in the day...
> 
> What is the suggested way of naming parameters?
> 
> I was taught {“a”/“an”}DataType, so it would be:
> 
> #name: aString
> 
> Which works ok (although falls apart if you refactor as the tools don’t 
> interpret it - although I guess could be improved to do so)
> 
> However often I find myself wanting to communicate a bit better as in:
> 
> #name: fullNameString
> 
> Which isn’t strictly a datatype (and I tend to leave out the a/an when I do 
> this). But it feels a bit off piste and it does make me consider whether my 
> selector is named badly and should be:
> 
> #fullName: aString
> 
> Which takes me back to the convention I learned long ago.
> 
> This said however, we often need to match similar #on:do, #in: generic 
> selector names and then it’s not always obvious the intent of parameter.
> 
> Any thoughts to share?
> 
> I ask because for exercism, we should try and set a good example.
> 
> Tim

Re: [Pharo-users] XML support for pharo

2018-07-11 Thread monty

This is the latest version of the XML/XPath Scraping Booklet: 



Re: [Pharo-users] Lost in stream

2018-06-20 Thread monty

https://ci.inria.fr/pharo-contribution/job/XMLParser/

You should be able to upgrade regardless of your Pharo version.


> Sent: Wednesday, June 20, 2018 at 8:39 AM
> From: Hilaire 
> To: pharo-users@lists.pharo.org
> Subject: Re: [Pharo-users] Lost in stream
>
> No, with Pharo, and a pretty old XML parser version. But still, this one was 
> always capable of parsing entities.
> 
> 
> On 18/06/2018 at 02:43, monty wrote:
> > Are you developing in Squeak? I ask because the XML parser in the default 
> > Squeak image is very old. You can download the latest Pharo XMLParser lib 
> > from the SqueakMap. It's the "XMLParser" project. Other libs I maintain are 
> > also available as SM projects, like "XMLParser-XPath".
> >
> > If you're doing DOM parsing, you can save a parsed DOM document to a file 
> > using a message like #printToFileNamed: (see the XMLNode "printing" 
> > category for more).
> 
> -- 
> Dr. Geo
> http://drgeo.eu
> 
> 
> 
>

Re: [Pharo-users] Lost in stream

2018-06-18 Thread monty

They still use (binary) StandardFileStreams on Pharo and Squeak. But since it's 
done through dynamically chosen file stream factory classes 
(XMLFileReadStreamFactory and XMLFileWriteStreamFactory), it's easy to add 
support for other stream classes. (The GemStone compat .mcz adds GsFile 
read/write factories, for example.)

#preferredImplementation selects which implementation to use in the hierarchy 
when there's more than one supported (#isSupportedImplementation).

> Sent: Monday, June 18, 2018 at 1:34 AM
> From: "Sven Van Caekenberghe" 
> To: "Any question about pharo is welcome" 
> Subject: Re: [Pharo-users] Lost in stream
>
> 
> 
> > On 18 Jun 2018, at 02:18, monty  wrote:
> > 
> > Also consider using XMLParser's built-in file reading support: 
> > #parseFileNamed:/#onFileNamed:. They work cross platform (Squeak, GS), and 
> > handle character decoding.
> 
> Monty, do these (already) work with all the latest changes in Pharo 7, I mean 
> the deprecation of FileStream and subclasses as well as 
> [RW|MultiByte]BinaryOrTextStream for FileReference, File and Zn streams ?
> 
> If not, we should help you.
> 
> Sven
>

Re: [Pharo-users] Lost in stream

2018-06-17 Thread monty

Are you developing in Squeak? I ask because the XML parser in the default 
Squeak image is very old. You can download the latest Pharo XMLParser lib from 
the SqueakMap. It's the "XMLParser" project. Other libs I maintain are also 
available as SM projects, like "XMLParser-XPath".

If you're doing DOM parsing, you can save a parsed DOM document to a file using 
a message like #printToFileNamed: (see the XMLNode "printing" category for 
more).
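
A rough sketch of that round trip, assuming XMLParser is loaded. The document content and the file name are invented for the example; only #printToFileNamed: and #parseFileNamed: come from the advice above:

```smalltalk
| document reloaded |
"A small hypothetical document; any parsed DOM tree would do."
document := XMLDOMParser parse: '<library><book title="Example"/></library>'.
"Write the DOM tree to a file (the file name is illustrative)."
document printToFileNamed: 'library-copy.xml'.
"Read it back with the built-in file support, which handles decoding."
reloaded := XMLDOMParser parseFileNamed: 'library-copy.xml'.
reloaded root name
```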



> Sent: Saturday, June 16, 2018 at 4:27 AM
> From: Hilaire 
> To: pharo-users@lists.pharo.org
> Subject: Re: [Pharo-users] Lost in stream
>
> Sometimes you just need a good sleep, which I was fortunate to have.
> 
> My newer code appended contents to the existing file, so it led to files 
> corrupted at the end. Ensuring it is deleted first solved the problem:
> 
> (location  asFileReference / filename) ensureDelete binaryWriteStreamDo: 
> [ :fileStream |
>         fileStream nextPutAll: stream contents]
> 
> There is still the issue of XML entities I described in a previous email; 
> the impact is minor for Dr. Geo though. Do others have issues with that?
> 
> Hilaire
> 
> 
> On 15/06/2018 at 18:31, Hilaire wrote:
> > This is the former code to save a file (XML or PNG):
> >
> > " |streamOnDisk|
> >     [streamOnDisk := MultiByteFileStream forceNewFileNamed: (self 
> > absolutePath: filename).
> >     streamOnDisk nextPutAll: stream contents] ensure:
> >         [streamOnDisk close]"
> >
> > and the newer one:
> >
> >     (location  asFileReference / filename) binaryWriteStreamDo: [ 
> > :fileStream |
> >         fileStream nextPutAll: stream contents]
> >
> > The new one produces wrong XML and PNG files. The files look 
> > truncated at the end.
> 
> -- 
> Dr. Geo
> http://drgeo.eu
> 
> 
> 
>

Re: [Pharo-users] Lost in stream

2018-06-17 Thread monty

Print-evaluate this:
'
' parseXML

and you should get (with any recent XMLParser lib):


If you don't want entity replacement, set #replacesContentEntityReferences: to 
false. Also consider using XMLParser's built-in file reading support: 
#parseFileNamed:/#onFileNamed:. They work cross platform (Squeak, GS), and 
handle character decoding.
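
As a hedged illustration of the switch mentioned above: the document below is a made-up example (not the one from this thread), and it assumes that XMLDOMParser class>>#on: and #parseDocument are available, as they are in recent XMLParser versions:

```smalltalk
| xml replaced parser unreplaced |
"A hypothetical document declaring one internal entity."
xml := '<!DOCTYPE greeting [<!ENTITY who "world">]><greeting>Hello, &who;</greeting>'.
"Default behaviour: entity references in content are replaced."
replaced := XMLDOMParser parse: xml.
"Same input, with content entity replacement turned off before parsing."
parser := XMLDOMParser on: xml.
parser replacesContentEntityReferences: false.
unreplaced := parser parseDocument.
```

Inspecting `replaced root` and `unreplaced root` side by side shows what the setting changes for your input.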




Re: [Pharo-users] Why do #select:thenXxx methods not return their result

2018-06-07 Thread monty

Not just more readable. They can also be more efficient. Look at 
#select:thenCollect: in OrderedCollection:
select: selectBlock thenCollect: collectBlock
    "Optimized version of Collection>>#select:thenCollect:"

    | newCollection element |

    newCollection := self copyEmpty.

    firstIndex to: lastIndex do: [ :index |
        element := array at: index.
        (selectBlock value: element)
            ifTrue: [ newCollection addLast: (collectBlock value: element) ] ].

    ^ newCollection

It only uses one temp collection, where a #select: followed by a separate 
#collect: would need two.
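
For comparison, a small Playground sketch of both forms on made-up data. It assumes an image where #select:thenCollect: answers its result, which is the case in current Pharo:

```smalltalk
| numbers twoPass onePass |
numbers := (1 to: 10) asOrderedCollection.
"Two steps: #select: builds an intermediate collection, then #collect: builds another."
twoPass := (numbers select: [ :each | each even ]) collect: [ :each | each * each ].
"One step: a single result collection is filled while iterating once."
onePass := numbers select: [ :each | each even ] thenCollect: [ :each | each * each ].
onePass = twoPass
```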



> Sent: Thursday, June 07, 2018 at 8:20 AM
> From: "Tim Mackinnon" 
> To: "Pharo Users Newsgroup" 
> Subject: [Pharo-users] Why do #select:thenXxx methods not return their result
>
> Hi - are the methods like #select:thenCollect: frowned upon? 
> 
> They seem quite readable; however, in using them I’ve noticed that unlike the 
> core methods they don't return the result of evaluation (they are missing a 
> ^). This is a shame, but possibly an oversight?
> 
> Tim
>

Re: [Pharo-users] configurable space in XML Writer tag

2018-04-06 Thread monty

It's changed to no longer emit that space. I was considering removing it, or at 
least making it configurable, anyway.

Re: [Pharo-users] Plans for XSD in XML(Parser) the foreseeable future?

2018-04-06 Thread monty

Unfortunately not. It's a major effort, and I'm not able to devote the time to 
it now.

Re: [Pharo-users] Problem with input to XML Parser - 'Invalid UTF8 encoding'

2017-10-10 Thread monty

I know what the problem is and will have it fixed shortly. Thanks for the 
report.

> Sent: Monday, October 09, 2017 at 9:03 AM
> From: "Peter Kenny" <pe...@pbkresearch.co.uk>
> To: pharo-users@lists.pharo.org
> Subject: Re: [Pharo-users] Problem with input to XML Parser - 'Invalid UTF8 
> encoding'
>
> Correction - I am misrepresenting Sven. What he said was that Zinc would not
> look inside the HTML  node to find out about coding. It would of
> course use information in the HTTP headers, if any.
> 
> 
> Peter Kenny wrote
> > Henry
> > 
> > Thanks for the explanations. It's a bit clearer now. I'm still not sure
> > about how ZnUrl>>retrieveContents manages to decode correctly in this
> > case;
> > I'm sure I recall Sven saying it didn't (and in his view shouldn't) look
> > at
> > the HTTP declarations in the header. There is also the mystery of how the
> > string reader in the XML-Parser package (XMLURI>>get) does the same trick,
> > when it is presumably what XMLHTMLParser>>parseURL: uses and fails.
> > 
> > However, all these are second order problems. It all begins because the
> > Corriere web site does strange things with encoding, including using a
> > UTF8
> > character in a page coded with 8859-1, as Paul pointed out. In any case,
> > reading the page as a string and then parsing it solves my problem, so I
> > shall stick to that as a standard procedure. Most importantly, I don't
> > think
> > there is any indication of a problem in the XML package for Monty to worry
> > about.
> > 
> > Thanks again
> > 
> > Peter
> > 
> > 
> > 
> > --
> > Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html
> 
> 
> 
> 
> 
> --
> Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html
> 
>

Re: [Pharo-users] Writing XML

2017-09-15 Thread monty



> Sent: Friday, September 15, 2017 at 4:30 PM
> From: "Jimmie Houchin" <jlhouc...@gmail.com>
> To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
> Subject: Re: [Pharo-users] Writing XML
>
> I didn't pay attention to this previously. But I just noticed that using 
> #printToFileNamed: preserved the DOM tree's original line ending, whereas 
> previously I had to ensure that the XMLWriter #lineEnding was changed 
> from defaultLineEnding to canonicalLineEnding. The original document 
> used LF and not CR.

To clarify, #printToFileNamed: and company use CRLF on Windows and LF 
elsewhere. XMLWriter uses Pharo's LE by default (CR), but it will use the 
preferred LE of your platform (LF or CRLF) with 
#enablePlatformSpecificLineBreak, LF when canonical XML 
(https://www.w3.org/TR/xml-c14n) is enabled, or whatever LE you want with 
#lineBreak:.

You can use #printToFileNamed:beforeWritingDo: with a block that sends 
#lineBreak: to the writer argument to get a custom LE when printing a DOM tree 
to a file.
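
A short sketch of that last suggestion, assuming XMLParser is loaded. The document and file name are invented for illustration, and String lf is used here as the custom line ending:

```smalltalk
| document |
"A hypothetical document standing in for a real parsed DOM tree."
document := XMLDOMParser parse: '<notes><note>first</note><note>second</note></notes>'.
"Configure the writer just before the tree is printed to the file."
document
    printToFileNamed: 'notes-lf.xml'
    beforeWritingDo: [ :writer | writer lineBreak: String lf ].
```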

> Overall this was a nice win.
> It cleaned up my method to save the file and reduced 7 lines to 2.
> Nice.  :)
> 
> Thanks.
> 
> Jimmie
> 
> 
> 
> On 09/15/2017 09:29 AM, monty wrote:
> > If you want to write a DOM tree to a file, send #printToFileNamed: (or a 
> > related message like #canonicallyPrintToFileNamed: or 
> > #printToFileNamed:beforeWritingDo:) to the root. See the XMLNode "printing" 
> > category for more. This will automatically encode the file with the 
> > encoding the XMLDocument>>#encoding attribute specifies (if recognized), 
> > and it's portable across Pharo, Squeak, and GemStone. Use 
> > #parseFileNamed:/#onFileNamed: to get portable automatic file decoding when 
> > parsing.
> >
> >> Sent: Wednesday, September 13, 2017 at 1:02 PM
> >> From: "Jimmie Houchin" <jlhouc...@gmail.com>
> >> To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
> >> Subject: [Pharo-users] Writing XML
> >>
> >> Hello,
> >>
> >> I am attempting to read and write an XML document.
> >>
> >> Currently I have parsed the document successfully. I have basic
> >> navigation and have learned how to modify the XMLDocument.
> >>
> >> Now I want to write the modified document back to the file system.
> >> What I have tried so far is:
> >>
> >> writer := XMLWriter new.
> >> xmldoc document writeXMLOn: writer.
> >> writer stream.
> >> f := File openForWriteFileNamed: '/home/jimmie/xmldoc.xml'.
> >> f nextPutAll: (writer write contents).
> >> f flush.
> >> f close.
> >>
> >> It does write an xml document to the file system. However, it has
> >> exploded in size. The original is 28mb and is in UTF-8. The newly
> >> written file is 112mb and is UTF-32.
> >>
> >> I do not know why the change in encoding or how to correct or manually
> >> set the encoding.
> >>
> >> Any help in understanding how to correctly write an XML document that I
> >> have read and minimally modified would be greatly appreciated.
> >>
> >> Thanks.
> >>
> >> Jimmie
> >>
> >>
> 
> 
>

Re: [Pharo-users] Writing XML

2017-09-15 Thread monty

If you want to write a DOM tree to a file, send #printToFileNamed: (or a 
related message like #canonicallyPrintToFileNamed: or 
#printToFileNamed:beforeWritingDo:) to the root. See the XMLNode "printing" 
category for more. This will automatically encode the file with the encoding 
the XMLDocument>>#encoding attribute specifies (if recognized), and it's 
portable across Pharo, Squeak, and GemStone. Use #parseFileNamed:/#onFileNamed: 
to get portable automatic file decoding when parsing.

> Sent: Wednesday, September 13, 2017 at 1:02 PM
> From: "Jimmie Houchin" 
> To: "Any question about pharo is welcome" 
> Subject: [Pharo-users] Writing XML
>
> Hello,
> 
> I am attempting to read and write an XML document.
> 
> Currently I have parsed the document successfully. I have basic 
> navigation and have learned how to modify the XMLDocument.
> 
> Now I want to write the modified document back to the file system.
> What I have tried so far is:
> 
> writer := XMLWriter new.
> xmldoc document writeXMLOn: writer.
> writer stream.
> f := File openForWriteFileNamed: '/home/jimmie/xmldoc.xml'.
> f nextPutAll: (writer write contents).
> f flush.
> f close.
> 
> It does write an xml document to the file system. However, it has 
> exploded in size. The original is 28mb and is in UTF-8. The newly 
> written file is 112mb and is UTF-32.
> 
> I do not know why the change in encoding or how to correct or manually 
> set the encoding.
> 
> Any help in understanding how to correctly write an XML document that I 
> have read and minimally modified would be greatly appreciated.
> 
> Thanks.
> 
> Jimmie
> 
>

Re: [Pharo-users] Thread-safe initialization of class state (was Re: Threads safety in Pharo)

2017-08-11 Thread monty

> Sent: Friday, August 11, 2017 at 6:36 AM
> From: "Denis Kudriashov" <dionisi...@gmail.com>
> To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
> Subject: Re: [Pharo-users] Thread-safe initialization of class state (was Re: 
> Threads safety in Pharo)
> 
> What package are you exploring? I can't find a fileTypes method in Pharo 7. 

Like I said, it's a hypothetical example. But I'm sure you could find similar 
examples of unsafe class state initialization in the image and in popular 
libraries.

> 2017-08-11 8:53 GMT+02:00 monty <mon...@programmer.net>:
> 
> Here's a hypothetical broken class method that does lazy initialization of a 
> class inst var:
> 
> fileTypes
>     fileTypes ifNil: [
>         fileTypes := Dictionary new.
>         fileTypes
>             at: 'txt' put: 'Text File';
>             at: 'html' put: 'Web Page';
>             at: 'pdf' put: 'Portable Document Format File';
>             at: 'doc' put: 'Microsoft Word Document'].
>     ^ fileTypes.
> 
> Because the assignment is done first and the initialization is done after 
> with a cascade of interruptible sends of #at:put:, there's a window after the 
> assignment where 'fileTypes' is not nil but also not fully initialized--a 
> race condition.
> 
> The fix is simple. Do the initialization before the atomic assignment takes 
> place, so the var is only ever bound to nil or a fully initialized object:
> 
> fileTypes
>     fileTypes ifNil: [
>         fileTypes :=
>             Dictionary new
>                 at: 'txt' put: 'Text File';
>                 at: 'html' put: 'Web Page';
>                 at: 'pdf' put: 'Portable Document Format File';
>                 at: 'doc' put: 'Microsoft Word Document';
>                 yourself].
>     ^ fileTypes.
> 
> The fixed code is still vulnerable to duplicate initialization, because the 
> initialization sequence is interruptible and 'fileTypes' is nil during it, 
> but as long as the initialization is cheap enough, has no side effects that 
> restrict how often it can be done, and it's enough that the initialized 
> objects are equal (but not identical), that's OK.
> 
> If it's too complex for a single statement, you can use temp vars or put it 
> in a separate factory method:
> 
> fileTypes
>     fileTypes ifNil: [
>         fileTypes := self newFileTypes].
>     ^ fileTypes.
> 
> Similar precautions (given how easy) might as well be taken with explicit 
> initialization of class state too. Of course if the object is mutated later 
> (in other methods), then Mutexes or other constructs are needed to guard 
> access. But for immutable class state, ensuring initialization is done before 
> assignment should be enough.
> 
> > Sent: Tuesday, August 01, 2017 at 7:36 AM
> > From: "Stephane Ducasse" <stepharo.s...@gmail.com>
> > To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
> > Subject: Re: [Pharo-users] Threads safety in Pharo
> >
> > I would love to have an analysis of assumptions made in some code.
> > Because my impression is that the concurrent code is sometimes defined
> > knowing the underlying logic of scheduler and this is not good.
> > As I said to abdel privately in french it would be great to start from
> > my french squeak book (Yes I wrote one long time ago) chapter on
> > concurrent programming and turn it into a pharo chapter.
> >
> > Stef
> >
> > On Tue, Aug 1, 2017 at 1:31 PM, Ben Coman 
> > <b...@openinworld.com> wrote:
> > > Not sure I'll have what you're looking for, but to start, do you mean
> > > Pharo's green threads or vm native threads?
> > > cheers -ben
> > >
> > > On Mon, Jul 31, 2017 at 7:38 AM, Alidra Abdelghani via Pharo-users
> > > <pharo-users@lists.pharo.org> wrote:
> > >>
> > >>
> > >>
> > >> -- Forwarded message --
> > >> From: Alidra Abdelghani <alidran...@yahoo.fr>
> > >> To: pharo-users@lists.pharo.org
> > >> Cc: "Stéphane Ducasse" 
> > >> <stephane.duca...@inria.fr>, farid arfi
> >

Re: [Pharo-users] Thread-safe initialization of class state (was Re: Threads safety in Pharo)

2017-08-11 Thread monty

> Sent: Friday, August 11, 2017 at 5:51 AM
> From: "Tim Mackinnon" <tim@testit.works>
> To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
> Subject: Re: [Pharo-users] Thread-safe initialization of class state (was Re: 
> Threads safety in Pharo)
> 
> Interesting, your example was subtle enough that I had to read it a few times 
> to understand the issue. (I was caught up on the ifNil and not the contents).

That's the problem. They're close enough that someone could mistakenly refactor 
the correct code into the incorrect code--and that code would be perfectly fine 
for lazy initialization of instance state in a non-concurrent class. That's why 
I always insert a small comment before explaining why it was written that way.
 
> Actually, thinking about it - isn't the real issue the #ifNil - you want an 
> atomic version.
>  
> It strikes me you could wrap the whole concept into something like an 
> AtomicLazyVariable? E.g.
>  
> initialise
>     fileTypes := AtomicLazyVariable using: [
>         Dictionary new
>             at: 'txt' put: 'Text File';
>             at: 'html' put: 'Web Page';
>             yourself ].
>  
> Then have something
>  
> fileTypes 
>     ^fileTypes value
>  
> Where you have a critical section in #value?
>  
> Tim
>  
> Sent from my iPhone
> On 11 Aug 2017, at 07:53, monty <mon...@programmer.net> wrote:
>  
> Here's a hypothetical broken class method that does lazy initialization of a 
> class inst var:
> 
> fileTypes
>     fileTypes ifNil: [
>         fileTypes := Dictionary new.
>         fileTypes
>             at: 'txt' put: 'Text File';
>             at: 'html' put: 'Web Page';
>             at: 'pdf' put: 'Portable Document Format File';
>             at: 'doc' put: 'Microsoft Word Document'].
>     ^ fileTypes.
> 
> Because the assignment is done first and the initialization is done after 
> with a cascade of interruptible sends of #at:put:, there's a window after the 
> assignment where 'fileTypes' is not nil but also not fully initialized--a 
> race condition.
> 
> The fix is simple. Do the initialization before the atomic assignment takes 
> place, so the var is only ever bound to nil or a fully initialized object:
> 
> fileTypes
>     fileTypes ifNil: [
>         fileTypes :=
>             Dictionary new
>                 at: 'txt' put: 'Text File';
>                 at: 'html' put: 'Web Page';
>                 at: 'pdf' put: 'Portable Document Format File';
>                 at: 'doc' put: 'Microsoft Word Document';
>                 yourself].
>     ^ fileTypes.
> 
> The fixed code is still vulnerable to duplicate initialization, because the 
> initialization sequence is interruptible and 'fileTypes' is nil during it, 
> but as long as the initialization is cheap enough, has no side effects that 
> restrict how often it can be done, and it's enough that the initialized 
> objects are equal (but not identical), that's OK.
> 
> If it's too complex for a single statement, you can use temp vars or put it 
> in a separate factory method:
> 
> fileTypes
>     fileTypes ifNil: [
>         fileTypes := self newFileTypes].
>     ^ fileTypes.
> 
> Similar precautions (given how easy) might as well be taken with explicit 
> initialization of class state too. Of course if the object is mutated later 
> (in other methods), then Mutexes or other constructs are needed to guard 
> access. But for immutable class state, ensuring initialization is done before 
> assignment should be enough.

[Pharo-users] Thread-safe initialization of class state (was Re: Threads safety in Pharo)

2017-08-11 Thread monty

Here's a hypothetical broken class method that does lazy initialization of a 
class inst var:

fileTypes
    fileTypes ifNil: [
        fileTypes := Dictionary new.
        fileTypes
            at: 'txt' put: 'Text File';
            at: 'html' put: 'Web Page';
            at: 'pdf' put: 'Portable Document Format File';
            at: 'doc' put: 'Microsoft Word Document'].
    ^ fileTypes.

Because the assignment is done first and the initialization is done after with 
a cascade of interruptible sends of #at:put:, there's a window after the 
assignment where 'fileTypes' is not nil but also not fully initialized--a race 
condition.

The fix is simple. Do the initialization before the atomic assignment takes 
place, so the var is only ever bound to nil or a fully initialized object:

fileTypes
    fileTypes ifNil: [
        fileTypes :=
            Dictionary new
                at: 'txt' put: 'Text File';
                at: 'html' put: 'Web Page';
                at: 'pdf' put: 'Portable Document Format File';
                at: 'doc' put: 'Microsoft Word Document';
                yourself].
    ^ fileTypes.

The fixed code is still vulnerable to duplicate initialization, because the 
initialization sequence is interruptible and 'fileTypes' is nil during it, but 
as long as the initialization is cheap enough, has no side effects that 
restrict how often it can be done, and it's enough that the initialized objects 
are equal (but not identical), that's OK.

If it's too complex for a single statement, you can use temp vars or put it 
in a separate factory method:

fileTypes
    fileTypes ifNil: [
        fileTypes := self newFileTypes].
    ^ fileTypes.

Similar precautions (given how easy they are) might as well be taken with explicit 
initialization of class state too. Of course if the object is mutated later (in 
other methods), then Mutexes or other constructs are needed to guard access. 
But for immutable class state, ensuring initialization is done before 
assignment should be enough.
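
For the mutated case, a minimal Playground sketch of guarding access with a Mutex. The temp vars stand in for class inst vars, and the names 'fileTypes' and 'lock' are illustrative rather than taken from any real class:

```smalltalk
| fileTypes lock |
"Fully initialize before the single assignment, as in the fixed method above."
fileTypes := Dictionary new
    at: 'txt' put: 'Text File';
    at: 'html' put: 'Web Page';
    yourself.
lock := Mutex new.
"Every later mutation goes through the same lock..."
lock critical: [ fileTypes at: 'pdf' put: 'Portable Document Format File' ].
"...and so does every read that could overlap with a mutation."
lock critical: [ fileTypes at: 'pdf' ifAbsent: [ nil ] ]
```

The lock itself has to be created safely too, for example in the class-side #initialize, otherwise its lazy creation has the same race.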

> Sent: Tuesday, August 01, 2017 at 7:36 AM
> From: "Stephane Ducasse" 
> To: "Any question about pharo is welcome" 
> Subject: Re: [Pharo-users] Threads safety in Pharo
>
> I would love to have an analysis of assumptions made in some code.
> Because my impression is that the concurrent code is sometimes defined
> knowing the underlying logic of scheduler and this is not good.
> As I said to abdel privately in french it would be great to start from
> my french squeak book (Yes I wrote one long time ago) chapter on
> concurrent programming and turn it into a pharo chapter.
> 
> Stef
> 
> On Tue, Aug 1, 2017 at 1:31 PM, Ben Coman  wrote:
> > Not sure I'll have what you're looking for, but to start, do you mean
> > Pharo's green threads or vm native threads?
> > cheers -ben
> >
> > On Mon, Jul 31, 2017 at 7:38 AM, Alidra Abdelghani via Pharo-users
> >  wrote:
> >>
> >>
> >>
> >> -- Forwarded message --
> >> From: Alidra Abdelghani 
> >> To: pharo-users@lists.pharo.org
> >> Cc: "Stéphane Ducasse" , farid arfi
> >> 
> >> Bcc:
> >> Date: Mon, 31 Jul 2017 01:38:58 +0200
> >> Subject: Threads safety in Pharo
> >> Hi,
> >>
> >> Somebody once evoked the problem of threads safety in Pharo. With a friend
> >> of mine who is expert in formal methods and process scheduling, we would
> >> like to have a look on it.
> >> Does anyone knows a good document describing the problem of Pharo with
> >> threads safety or at least any document that we can start with?
> >>
> >> Thanks in advance,
> >> Abdelghani
> >>
> >>
> >>
> >
>

Re: [Pharo-users] XPath has unresolved dependencies

2017-06-05 Thread monty

Thanks, it should be fixed now.
 

Sent: Monday, June 05, 2017 at 6:50 AM
From: "Henrik Nergaard" 
To: "Any question about pharo is welcome" , "mon...@programmer.net" 
Subject: XPath has unresolved dependencies




Hi,

 

Loading XPath from the catalog browser into the latest image raises a warning that there are unresolved references, and that the package depends on the following classes: XMLHighlighter, XMLHighlighterDefaults, GLMXMLHighlighterTextStylerDecorator. 

 

The regular XPath works fine ignoring the warnings, but inspecting an XML document and clicking the XPath pane gives an error.

 

 

Best regards,

Henrik

Re: [Pharo-users] Dictionary whose values are dictionaries

2017-05-27 Thread monty

You could just use a regular Dictionary, but with association keys:
    dict
        at: outerKey -> innerKey
        put: innerValue

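A small Playground sketch of the idea, with made-up keys and values. It relies on Associations comparing equal when both key and value are equal, so a freshly built association finds the stored entry:

```smalltalk
| dict |
dict := Dictionary new.
"Each entry is keyed by an outerKey -> innerKey association."
dict at: #fruit -> #apple put: 3.
dict at: #fruit -> #pear put: 5.
dict at: #tool -> #hammer put: 1.
"Lookup builds an equal association on the fly."
dict at: #fruit -> #pear
```

The trade-off is that listing all inner keys for one outer key means scanning the keys, e.g. `dict keys select: [ :each | each key = #fruit ]`.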
Re: [Pharo-users] Problems loading XML System ( was [Zinc] ZnInvalidUTF8: Illegal leading byte for utf-8 encoding)

2017-05-21 Thread monty

Creating two separate DOM subtrees for two descendants of the body element 
should be faster and consume less memory than creating one subtree for the 
entire body. You should also consider benchmarking different approaches, and 
profiling to identify whether parsing, querying, network IO, or something 
else is your bottleneck before optimizing.

> Sent: Saturday, May 20, 2017 at 7:08 AM
> From: PBKResearch <pe...@pbkresearch.co.uk>
> To: "'Any question about pharo is welcome'" <pharo-users@lists.pharo.org>
> Subject: Re: [Pharo-users] Problems loading XML System ( was [Zinc] 
> ZnInvalidUTF8: Illegal leading byte for utf-8 encoding)
>
> Monty
> 
> Many thanks for your patient explanation. I would like to try one 
> supplementary question, if I may. Almost all my work involves reading HTML 
> files from the web and extracting relevant sections, and I am trying to work 
> out how to divide the effort between StAX and XPath. For example, if I am 
> reading an article from Frankfurter Allgemeine, I am looking for two tags:
>  
> 
> which contain the intro and body of the article; everything else can be 
> discarded.
> 
> Using StAX, I can find the first  with something like this (adapted from 
> your second snippet):
> 
> [((tag := parser peek) isStartTagNamed: 'div') and: [ tag hasAttributes and: 
> [(tag attributeAt: 'class') = 'FAZArtikelEinleitung']]]
> whileFalse: [parser next].
> intro := parser nextNode.
> 
> and similarly for the body. I suppose if this is common I could subclass and 
> make this a method.
> 
> Does this look sensible, and is it more efficient than reading the entire 
> body with StAX and locating the relevant sections with XPath? (I have tried 
> this snippet, and it works. My question is really whether this is the best 
> way to go about it?)
> 
> Many thanks for any advice.
> 
> Peter Kenny
> 
> 
> -Original Message-
> From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of 
> monty
> Sent: 17 May 2017 22:10
> To: pharo-users@lists.pharo.org
> Subject: Re: [Pharo-users] Problems loading XML System ( was [Zinc] 
> ZnInvalidUTF8: Illegal leading byte for utf-8 encoding)
> 
> For example, this:
> ((StAXHTMLParser onURL: aURLString)
>   nextElementNamed: 'head')
>   ifNotNil: [:headElement | ...]
> 
> parses the document up to the next "head" element and returns it and any 
> descendants as a DOM subtree. If there's no next "head" element, it exhausts 
> the event stream looking for one. If you don't want that, test it first:
> (parser peek isStartTagNamed: 'head')
>   ifTrue: [| headElement |
>   headElement := parser nextNode.
>   ...].
> 
> because you now know what kind of DOM subtree the next events represent, 
> #nextNode is used, which builds any DOM subtree out of the next events, 
> including an element with descendants, a string or comment node, or even an 
> entire document (if sent before reading the start-of-document event). So this:
> (StAXHTMLParser onURL: aURLString) nextNode
> 
> is equivalent to this:
> XMLHTMLParser parseURL: aURLString.
> 
> StAX is more useful with XML than HTML, because XML documents can be huge.
> 
> > Sent: Tuesday, May 16, 2017 at 6:39 PM
> > From: PBKResearch <pe...@pbkresearch.co.uk>
> > To: "'Any question about pharo is welcome'" <pharo-users@lists.pharo.org>
> > Subject: Re: [Pharo-users] Problems loading XML System ( was [Zinc] 
> > ZnInvalidUTF8: Illegal leading byte for utf-8 encoding)
> >
> > Monty
> > 
> > Many thanks for your help. I have followed your advice to start again in a 
> > clean Moose 6.1 image, and so far everything is working fine. Apologies for 
> > getting you to sort out the results of my stupidity. In Pharo I am really 
> > an experienced beginner.
> > 
> > Thanks again
> > 
> > Peter Kenny
> > 
> > -Original Message-
> > From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of 
> > monty
> > Sent: 16 May 2017 03:37
> > To: pharo-users@lists.pharo.org
> > Subject: Re: [Pharo-users] Problems loading XML System ( was [Zinc] 
> > ZnInvalidUTF8: Illegal leading byte for utf-8 encoding)
> > 
> > Something went wrong during your upgrade with class initialization.
> > 
> > Installing the latest versions of these projects into a clean image would 
> > work, and so would installing the latest XMLParserHTML and XMLParserStAX 
> > into the newest Moose-6.1 image (which has the latest XMLParser and XPath).
> > 
> > But if you insist on upgrading yo

Re: [Pharo-users] Problems loading XML System ( was [Zinc] ZnInvalidUTF8: Illegal leading byte for utf-8 encoding)

2017-05-17 Thread monty

For example, this:
((StAXHTMLParser onURL: aURLString)
  nextElementNamed: 'head')
  ifNotNil: [:headElement | ...]

parses the document up to the next "head" element and returns it and any 
descendants as a DOM subtree. If there's no next "head" element, it exhausts 
the event stream looking for one. If you don't want that, test it first:
(parser peek isStartTagNamed: 'head')
  ifTrue: [| headElement |
    headElement := parser nextNode.
    ...].

because you now know what kind of DOM subtree the next events represent, 
#nextNode is used, which builds any DOM subtree out of the next events, 
including an element with descendants, a string or comment node, or even an 
entire document (if sent before reading the start-of-document event). So this:
(StAXHTMLParser onURL: aURLString) nextNode

is equivalent to this:
XMLHTMLParser parseURL: aURLString.

StAX is more useful with XML than HTML, because XML documents can be huge.

> Sent: Tuesday, May 16, 2017 at 6:39 PM
> From: PBKResearch <pe...@pbkresearch.co.uk>
> To: "'Any question about pharo is welcome'" <pharo-users@lists.pharo.org>
> Subject: Re: [Pharo-users] Problems loading XML System ( was [Zinc] 
> ZnInvalidUTF8: Illegal leading byte for utf-8 encoding)
>
> Monty
> 
> Many thanks for your help. I have followed your advice to start again in a 
> clean Moose 6.1 image, and so far everything is working fine. Apologies for 
> getting you to sort out the results of my stupidity. In Pharo I am really an 
> experienced beginner.
> 
> Thanks again
> 
> Peter Kenny
> 
> -Original Message-
> From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of 
> monty
> Sent: 16 May 2017 03:37
> To: pharo-users@lists.pharo.org
> Subject: Re: [Pharo-users] Problems loading XML System ( was [Zinc] 
> ZnInvalidUTF8: Illegal leading byte for utf-8 encoding)
> 
> Something went wrong during your upgrade with class initialization.
> 
> Installing the latest versions of these projects into a clean image would 
> work, and so would installing the latest XMLParserHTML and XMLParserStAX into 
> the newest Moose-6.1 image (which has the latest XMLParser and XPath).
> 
> But if you insist on upgrading your old image, try the latest 
> ConfigurationOfXMLParser (.303.mcz) and ConfigurationOfXPath (.149.mcz) from 
> their PharoExtras repos and install their latest project versions, and do the 
> same with XMLParserHTML and XMLParserStAX (the older versions aren't 
> compatible with newer XMLParser versions). Then open the test runner and run 
> all "XML|XPath" tests. If you get any failures, evaluate this:
> 
> #('XML-Parser' 'XPath-Core') do: [:package |
>   (SystemNavigation default allClassesInPackageNamed: package) do: 
> [:class |
>   class initialize]]
> 
> and try running the tests again.
> 
> > Sent: Monday, May 15, 2017 at 6:50 PM
> > From: PBKResearch <pe...@pbkresearch.co.uk>
> > To: "'Any question about pharo is welcome'" <pharo-users@lists.pharo.org>
> > Subject: [Pharo-users] Problems loading XML System ( was Re: [Zinc] 
> > ZnInvalidUTF8: Illegal leading byte for utf-8 encoding)
> >
> > Monty
> > 
> > As an update, I have rebuilt from the Moose 6.0 download. The version of 
> > XML-Parser in that was dated 18 July 2016 (configuration monty.233), so I 
> > installed versions of XML-Parser-HTML and XML-Parser-StAX contemporary with 
> > that. (The respective configurations are monty.48 and monty.39). With these 
> > versions all my previous XMLHTMLParser operations work as before, and I 
> > have been able to use the StAX parser in a simple way. So I can start 
> > exploring as I intended.
> > 
> > I have made repeated attempts to update this rebuilt image to more recent 
> > versions of the HTML and StAX parsers, and every time I run into the same 
> > error reported below. I started from the latest version and worked 
> > backwards, but gave up quickly; it takes about 6 minutes on my machine to 
> > load and compile a version, and it soon gets tedious. If I feel more 
> > enthusiastic tomorrow, I might start working forwards from my current 
> > versions.
> > 
> > Anyway, I now have a working system with the StaX and HTML parsers, so I 
> > can continue to explore.
> > 
> > Best wishes
> > 
> > Peter Kenny
> > 
> > -Original Message-
> > From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of 
> > PBKResearch
> > Sent: 15 May 2017 20:44
> > To: 'Any question about pharo is welcome' <pharo-users@lists.pharo.org>
> >

Re: [Pharo-users] Problems loading XML System ( was [Zinc] ZnInvalidUTF8: Illegal leading byte for utf-8 encoding)

2017-05-15 Thread monty

Something went wrong during your upgrade with class initialization.

Installing the latest versions of these projects into a clean image would work, 
and so would installing the latest XMLParserHTML and XMLParserStAX into the 
newest Moose-6.1 image (which has the latest XMLParser and XPath).

But if you insist on upgrading your old image, try the latest 
ConfigurationOfXMLParser (.303.mcz) and ConfigurationOfXPath (.149.mcz) from 
their PharoExtras repos and install their latest project versions, and do the 
same with XMLParserHTML and XMLParserStAX (the older versions aren't compatible 
with newer XMLParser versions). Then open the test runner and run all 
"XML|XPath" tests. If you get any failures, evaluate this:

#('XML-Parser' 'XPath-Core') do: [:package |
  (SystemNavigation default allClassesInPackageNamed: package) do: [:class |
    class initialize]]

and try running the tests again.

> Sent: Monday, May 15, 2017 at 6:50 PM
> From: PBKResearch <pe...@pbkresearch.co.uk>
> To: "'Any question about pharo is welcome'" <pharo-users@lists.pharo.org>
> Subject: [Pharo-users] Problems loading XML System ( was Re: [Zinc] 
> ZnInvalidUTF8: Illegal leading byte for utf-8 encoding)
>
> Monty
> 
> As an update, I have rebuilt from the Moose 6.0 download. The version of 
> XML-Parser in that was dated 18 July 2016 (configuration monty.233), so I 
> installed versions of XML-Parser-HTML and XML-Parser-StAX contemporary with 
> that. (The respective configurations are monty.48 and monty.39). With these 
> versions all my previous XMLHTMLParser operations work as before, and I have 
> been able to use the StAX parser in a simple way. So I can start exploring as 
> I intended.
> 
> I have made repeated attempts to update this rebuilt image to more recent 
> versions of the HTML and StAX parsers, and every time I run into the same 
> error reported below. I started from the latest version and worked backwards, 
> but gave up quickly; it takes about 6 minutes on my machine to load and 
> compile a version, and it soon gets tedious. If I feel more enthusiastic 
> tomorrow, I might start working forwards from my current versions.
> 
> Anyway, I now have a working system with the StaX and HTML parsers, so I can 
> continue to explore.
> 
> Best wishes
> 
> Peter Kenny
> 
> -Original Message-
> From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of 
> PBKResearch
> Sent: 15 May 2017 20:44
> To: 'Any question about pharo is welcome' <pharo-users@lists.pharo.org>
> Subject: Re: [Pharo-users] [Zinc] ZnInvalidUTF8: Illegal leading byte for 
> utf-8 encoding
> 
> Monty
> 
> I have just started trying to use the StAX parsers, and I have found that the 
> update has introduced a problem, which means that XMLHTMLParser no longer 
> works on examples I have used before. I updated to 
> ConfigurationOfXMLParser(monty.302), which is the latest version on the 
> smalltalkhub repository, and then used the load version in the class comment, 
> which loads the stable default. Similarly, I loaded 
> ConfigurationOfXMLParserHTML(monty.62) and 
> ConfigurationOfXMLParserStAX(monty.51), again using stable and default. When 
> I try to run the XMLHTMLParser example I quoted below, I get an error message 
> 'MessageNotunderstood: receiver of "critical:" is nil'. The same message 
> comes up with anything else I try with XMLHTMLParser or with StAXHTMLParser.
> 
> I am not really up to using the debugger on someone else's code, but the one 
> thing I can see is that the problem lies in XMLKeyValueCache>>critical:, 
> which has the code:
> ^ self mutex critical: aBlock
> The problem being that mutex is nil. 
> 
> In my enthusiasm, I saved the updated image with the same name as the old 
> image, which is now therefore overwritten. If I cannot solve this problem, my 
> only way out is to rebuild my image from the Moose 6.0 download. Any 
> suggestions gratefully received.
> 
> Thanks in advance
> 
> Peter Kenny
> 
> -Original Message-
> From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of 
> PBKResearch
> Sent: 15 May 2017 19:16
> To: 'Any question about pharo is welcome' <pharo-users@lists.pharo.org>
> Subject: Re: [Pharo-users] [Zinc] ZnInvalidUTF8: Illegal leading byte for 
> utf-8 encoding
> 
> Monty
> 
> Many thanks for this. My original purpose was just to answer Paul 
> deBruicker's query, namely to parse an html file and stop reading at the end 
> of the <head> section. I solved this by trial and error using the code shown 
> below ( which actually stops at the opening tag of the body). This was not my 
> problem at all, but Paul's; I just tackled it for fun

Re: [Pharo-users] [Zinc] ZnInvalidUTF8: Illegal leading byte for utf-8 encoding

2017-05-15 Thread monty

For that kind of incremental parsing, you could also use XMLParserStAX, a 
pull-parser that parses a document as a stream of event objects you control 
with #next, #peek, and #atEnd. It also supports pull-DOM parsing with messages 
like #nextNode, #nextElement, and #nextElementNamed:, which return the next 
event object(s) as DOM subtrees (searchable with XPath). See the StAXParser 
class comment for an example. (The StAXHTMLParser class requires XMLParserHTML 
be installed to work.)
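
A small sketch of that pull style, mixing raw event reading with pull-DOM. It uses only messages named in this thread (#onURL:, #atEnd, #peek, #next, #isStartTagNamed:, #nextNode); the URL and the choice of "title" elements are placeholders:

```smalltalk
| parser titles |
parser := StAXHTMLParser onURL: 'http://example.com/'.   "placeholder URL"
titles := OrderedCollection new.

[ parser atEnd ] whileFalse: [
	(parser peek isStartTagNamed: 'title')
		ifTrue: [
			"Consume the events for this element as one DOM subtree."
			titles add: parser nextNode ]
		ifFalse: [
			"Not interesting: discard a single event and move on."
			parser next ] ].

titles
```

Because events are consumed one at a time, the loop can also stop early (for example after the first match) without parsing the rest of the document.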

> Sent: Friday, May 12, 2017 at 5:30 AM
> From: PBKResearch 
> To: "'Any question about pharo is welcome'" 
> Subject: Re: [Pharo-users] [Zinc] ZnInvalidUTF8: Illegal leading byte for 
> utf-8 encoding
>
> With reference to Norbert's comment, there /may/ be an ambiguity about the
> word 'header' in Udo's reply. It could refer to the http HEAD section, in
> which case Norbert is of course right. It could also refer to the <head>
> section of the html file, which is part of the content of the http response.
> If it is the latter, this is similar to a question that Paul deBruicker
> posted last November ("[Pharo-users] ZnClient GET, but just the content of
> the <head> tag?"). I tried the method I devised for Paul's case on Udo's
> problem website, and read the html header with no problem. Incidentally, the
> header includes 'charset=iso-8859-1', which does not agree with Sven's
> findings.
> 
> In case it is of interest, I used XMLHTMLParser to read and parse the
> header. Try the following in a Playground:
> 
> par := XMLHTMLParser onURL:
> 'http://www.darkreading.com/partner-perspectives/malwarebytes/locky-returns-
> with-a-new-(borrowed)-distribution-method/a/d-id/1328723'.
> par parseDocumentUntil: [|top|(top := par topNode) notNil and: [ top
> isElement and:[ top isNamed: 'body']]].
> par parsingResult findElementNamed: 'head'.
> 
> If you 'Do it and go', the full header appears. The way I get it to stop
> after the header may not be quite correct, because it uses
> XMLHTMLParser>>topNode, which is a private method. On the other hand, I
> can't see how to make the stop condition for
> XMLHTMLParser>>parseDocumentUntil: depend on the parsed results without
> using a private method.
> 
> Hope this is helpful
> 
> Peter Kenny
> 
> -Original Message-
> From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of
> Norbert Hartl
> Sent: 12 May 2017 08:04
> To: Any question about pharo is welcome 
> Subject: Re: [Pharo-users] [Zinc] ZnInvalidUTF8: Illegal leading byte for
> utf-8 encoding
> 
> Just to mention. If you are not interested in the content body you could do
> a HEAD request instead of GET. 
> 
> Norbert
> 
> > On 11.05.2017 at 22:44, Udo Schneider wrote:
> > 
> > Hi Sven,
> > 
> > that's perfect. To be honest I don't care about the content - I'm just
> parsing the header. And even if there is a wrong decoding in there... I can
> live with that.
> > 
> > Thank you very very much! For your help but also your stuff in general.
> > 
> > CU,
> > 
> > Udo
> > 
> > 
> >> On 11/05/17 at 22:35, Sven Van Caekenberghe wrote:
> >> Hi Udo,
> >>> On 11 May 2017, at 21:37, Udo Schneider 
> wrote:
> >>> 
> >>> All,
> >>> 
> >>> I'm hitting an error where fetching web content fails. The website does
> indeed use invalid characters.
> >>> 
> >>> The easiest way to reproduce:
> >>> 
> >>> ZnEasy get:
> 'http://www.darkreading.com/partner-perspectives/malwarebytes/locky-returns-
> with-a-new-(borrowed)-distribution-method/a/d-id/1328723'
> >>> 
> >>> Is there any way to tell Zinc to simply ignore that error and to
> continue?
> >>> 
> >>> CU,
> >>> 
> >>> Udo
> >> That server/page has a mime-type text/plain with no explicit encoding
> (charset) setting, so we have to guess. Like utf-8, pure latin1/iso88591
> does not work. The following does work, but you can't be sure everything
> went well (beLenient takes some bytes as they are).
> >> ZnDefaultCharacterEncoder
> >>   value: ZnCharacterEncoder latin1 beLenient
> >>   during: [
> >> ZnClient new
> >>   get:
> 'http://www.darkreading.com/partner-perspectives/malwarebytes/locky-returns-
> with-a-new-(borrowed)-distribution-method/a/d-id/1328723';
> >>   yourself ].
> >> I added some API earlier today, so that the following should also work
> (you need to load Zn #bleedingEdge first).
> >>  ZnClient new
> >>   defaultEncoder: ZnCharacterEncoder latin1 beLenient;
> >>   get:
> 'http://www.darkreading.com/partner-perspectives/malwarebytes/locky-returns-
> with-a-new-(borrowed)-distribution-method/a/d-id/1328723';
> >>   yourself.
> >> HTH,
> >> Regards,
> >> Sven
> > 
> > 
> > 
> 
> 
> 
>

Re: [Pharo-users] [Zinc] ZnInvalidUTF8: Illegal leading byte for utf-8 encoding

2017-05-13 Thread monty

> Sent: Friday, May 12, 2017 at 5:30 AM
> From: PBKResearch 
> To: "'Any question about pharo is welcome'" 
> Subject: Re: [Pharo-users] [Zinc] ZnInvalidUTF8: Illegal leading byte for 
> utf-8 encoding
>
> With reference to Norbert's comment, there /may/ be an ambiguity about the
> word 'header' in Udo's reply. It could refer to the http HEAD section, in
> which case Norbert is of course right. It could also refer to the <head>
> section of the html file, which is part of the content of the http response.
> If it is the latter, this is similar to a question that Paul deBruicker
> posted last November ("[Pharo-users] ZnClient GET, but just the content of
> the <head> tag?"). I tried the method I devised for Paul's case on Udo's
> problem website, and read the html header with no problem. Incidentally, the
> header includes 'charset=iso-8859-1', which does not agree with Sven's
> findings.
> 
> In case it is of interest, I used XMLHTMLParser to read and parse the
> header. Try the following in a Playground:
> 
> par := XMLHTMLParser onURL:
> 'http://www.darkreading.com/partner-perspectives/malwarebytes/locky-returns-
> with-a-new-(borrowed)-distribution-method/a/d-id/1328723'.
> par parseDocumentUntil: [|top|(top := par topNode) notNil and: [ top
> isElement and:[ top isNamed: 'body']]].
> par parsingResult findElementNamed: 'head'.
> 
> If you 'Do it and go', the full header appears. The way I get it to stop
> after the header may not be quite correct, because it uses
> XMLHTMLParser>>topNode, which is a private method. On the other hand, I
> can't see how to make the stop condition for
> XMLHTMLParser>>parseDocumentUntil: depend on the parsed results without
> using a private method.

There's always #document. But since I can't see any possible harm, I'll make it 
public.
 
> Hope this is helpful
> 
> Peter Kenny
> 
> -Original Message-
> From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of
> Norbert Hartl
> Sent: 12 May 2017 08:04
> To: Any question about pharo is welcome 
> Subject: Re: [Pharo-users] [Zinc] ZnInvalidUTF8: Illegal leading byte for
> utf-8 encoding
> 
> Just to mention. If you are not interested in the content body you could do
> a HEAD request instead of GET. 
> 
> Norbert
> 
> > On 11.05.2017 at 22:44, Udo Schneider wrote:
> > 
> > Hi Sven,
> > 
> > that's perfect. To be honest I don't care about the content - I'm just
> parsing the header. And even if there is a wrong decoding in there... I can
> live with that.
> > 
> > Thank you very very much! For your help but also your stuff in general.
> > 
> > CU,
> > 
> > Udo
> > 
> > 
> >> On 11/05/17 at 22:35, Sven Van Caekenberghe wrote:
> >> Hi Udo,
> >>> On 11 May 2017, at 21:37, Udo Schneider 
> wrote:
> >>> 
> >>> All,
> >>> 
> >>> I'm hitting an error where fetching web content fails. The website does
> indeed use invalid characters.
> >>> 
> >>> The easiest way to reproduce:
> >>> 
> >>> ZnEasy get:
> 'http://www.darkreading.com/partner-perspectives/malwarebytes/locky-returns-
> with-a-new-(borrowed)-distribution-method/a/d-id/1328723'
> >>> 
> >>> Is there any way to tell Zinc to simply ignore that error and to
> continue?
> >>> 
> >>> CU,
> >>> 
> >>> Udo
> >> That server/page has a mime-type text/plain with no explicit encoding
> (charset) setting, so we have to guess. Like utf-8, pure latin1/iso88591
> does not work. The following does work, but you can't be sure everything
> went well (beLenient takes some bytes as they are).
> >> ZnDefaultCharacterEncoder
> >>   value: ZnCharacterEncoder latin1 beLenient
> >>   during: [
> >> ZnClient new
> >>   get:
> 'http://www.darkreading.com/partner-perspectives/malwarebytes/locky-returns-
> with-a-new-(borrowed)-distribution-method/a/d-id/1328723';
> >>   yourself ].
> >> I added some API earlier today, so that the following should also work
> (you need to load Zn #bleedingEdge first).
> >>  ZnClient new
> >>   defaultEncoder: ZnCharacterEncoder latin1 beLenient;
> >>   get:
> 'http://www.darkreading.com/partner-perspectives/malwarebytes/locky-returns-
> with-a-new-(borrowed)-distribution-method/a/d-id/1328723';
> >>   yourself.
> >> HTH,
> >> Regards,
> >> Sven
> > 
> > 
> > 
> 
> 
> 
>

Re: [Pharo-users] XMLHTMLParser Entity Handling oddity

2017-05-06 Thread monty

Yes, but at this point it will probably be a booklet, like the Glorp and Smacc 
ones you posted.

> Sent: Saturday, May 06, 2017 at 6:19 AM
> From: "Stephane Ducasse" <stepharo.s...@gmail.com>
> To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
> Subject: Re: [Pharo-users] XMLHTMLParser Entity Handling oddity
> 
> Hi guys
>  
> It would be supercool to have a chapter on the XML package. 
> Does any of you have the knowledge to do it?
> I do not have it. 
>  
> Stef
>  
>  
> On Sat, May 6, 2017 at 9:51 AM, Udo Schneider 
> <udo.schnei...@homeaddress.de> wrote:
> Perfect! Thank you very very much!
> 
> On 05/05/17 at 19:28, monty wrote:
> 
>  This should be fixed now. Thanks for the bug report.
>  Sent: Wednesday, May 03, 2017 at 4:44 PM
> From: "Udo Schneider" <udo.schnei...@homeaddress.de>
> To: pharo-users@lists.pharo.org
> Subject: [Pharo-users] XMLHTMLParser Entity Handling oddity
> 
> All,
> 
> I'm hitting an interesting issue with XMLHTMLParser and I'm not even
> sure if this is a bug or intended behaviour. Given an HTML Entity in a
> String it's resolved or quoted depending on the tag (header or section tag):
> 
> doc := XMLHTMLParser parse:
> 'ÜÜ'.
> (doc findElementNamed: 'title') contentString. "'Ü'"
> (doc findElementNamed: 'body') contentString.  "'Ü'"
> 
> In my understanding and according to
> https://www.w3.org/TR/html401/struct/global.html#h-7.4.2 Entities in the
> title tag are allowed and should IMHO be resolved.
> 
> So both should return 'Ü' in this case.
> 
> Any pointers?
> 
> CU,
> 
> Udo

Re: [Pharo-users] XMLHTMLParser Entity Handling oddity

2017-05-05 Thread monty

This should be fixed now. Thanks for the bug report.

> Sent: Wednesday, May 03, 2017 at 4:44 PM
> From: "Udo Schneider" 
> To: pharo-users@lists.pharo.org
> Subject: [Pharo-users] XMLHTMLParser Entity Handling oddity
>
> All,
> 
> I'm hitting an interesting issue with XMLHTMLParser and I'm not even 
> sure if this is a bug or intended behaviour. Given an HTML Entity in a 
> String it's resolved or quoted depending on the tag (header or section tag):
> 
> doc := XMLHTMLParser parse: 
> 'ÜÜ'.
> (doc findElementNamed: 'title') contentString. "'Ü'"
> (doc findElementNamed: 'body') contentString.  "'Ü'"
> 
> In my understanding and according to 
> https://www.w3.org/TR/html401/struct/global.html#h-7.4.2 Entities in the 
> title tag are allowed and should IMHO be resolved.
> 
> So both should return 'Ü' in this case.
> 
> Any pointers?
> 
> CU,
> 
> Udo
> 
> 
>

Re: [Pharo-users] best practices for using external files for testing

2017-04-18 Thread monty

XMLParser's XML-Tests-Conformance project, which is automatically generated 
from the W3C's Conformance Test Suites project (https://www.w3.org/XML/Test/), 
stores the contents of its files in class methods. This way it's 
self-contained, portable, and the actual files only need to be downloaded and 
unzipped to regenerate the TestCases.
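
As a hedged illustration of the general idea (not the actual XML-Tests-Conformance generator), a file's contents can be compiled into a class-side method that returns them as a string literal. MyTestResources is a hypothetical, already existing class, and the file path and selector are placeholders:

```smalltalk
| contents source |
"Read the external file once, at generation time."
contents := 'tests/my-test-file.xmi' asFileReference contents.

"#printString yields a quoted, escaped String literal."
source := 'myTestFileXMI
	^ ' , contents printString.

"Install it as a class-side method of the hypothetical resource class."
MyTestResources class compile: source classified: 'resources'.
```

Tests can then send MyTestResources myTestFileXMI and never touch the file system; the file is only needed again when the method is regenerated.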

Consider extracting it into a separate project with a separate CI job if it 
gets too big.

> Sent: Saturday, April 15, 2017 at 12:52 PM
> From: "Peter Uhnak" 
> To: pharo-users@lists.pharo.org
> Subject: [Pharo-users] best practices for using external files for testing
>
> Hi,
> 
> is there a common/best practice for using external files in tests?
> 
> In my specific case I am interested in git-based projects, where I have a big 
> (~1MB) file stored in repository and I would like to use it in my tests.
> 
> For GitFileTree project I could presumably use the following to access it:
> 
> 'OP-XMI' asPackage mcPackage workingCopy repositoryGroup remotes first 
> directory / 'tests' / 'my-test-file.xmi'
> 
> This will retrieve the MCPackage of the Package and then retrieve where 
> the repo is actually stored on the disk.
> 
> Are there better ways to do this? Could something similar be done with 
> IceBerg?
> 
> (p.s. in theory I could compile the entire file (e.g. 1MB) to a method, but 
> that is very ugly to me)
> 
> Thanks,
> Peter
> 
>

Re: [Pharo-users] PetitParser question parsing HTML meta tags

2017-04-02 Thread monty

XMLParserHTML is the fastest HTML parser on Pharo, Squeak, and GS. It has DOM 
and SAX parsers and works with other libs such as PharoExtras/XPath and 
PharoExtras/XMLParserStAX.

Element and attribute names are normalized to lowercase, and printing XML DOM 
trees back as HTML is complicated by browsers not recognizing XML-style 
self-closing tags ending with "/>" for some elements (like "script"), so use 
#printedWithoutSelfClosingTags/#printWithoutSelfClosingTagsOn:/#printWithoutSelfClosingTagsToFileNamed:
 instead.
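
A short sketch of the difference, using a made-up HTML string. The exact printed output depends on the library version, so the comments describe the intent rather than quote results:

```smalltalk
| doc |
doc := XMLHTMLParser parse:
	'<html><head><script src="app.js"></script></head><body><p>Hi</p></body></html>'.

"Default XML-style printing may emit empty elements as self-closing tags."
doc printString.

"HTML-friendly printing: empty elements get explicit end tags."
doc printedWithoutSelfClosingTags.
```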

> Sent: Thursday, March 30, 2017 at 1:58 PM
> From: "PAUL DEBRUICKER" 
> To: "Any question about pharo is welcome" 
> Subject: [Pharo-users] PetitParser question parsing HTML meta tags
>
> This is kind of a "I'm tired of thinking about this and not making much 
> progress for the amount of time I'm putting in question" but here it is: 
> 
> 
> 
> I'm trying to parse descriptions from HTML meta elements.  I can't use Soup 
> because there isn't a working GemStone port.  
> 
> I've got it to work with the structure:
> 
> 
> 
> and 
> 
> 
> 
> 
> but I'm running into instances of: 
> 
> 
> 
> and
> 
> 
> 
> 
> and am having trouble adapting my parsing code (such as it is). 
> 
> 
> The parsing code that addresses the first two cases is:
> 
> 
> 
> parseHtmlPageForDescription: htmlString
>   | startParser endParser ppStream descParser result text lower str 
> doubleQuoteIndex |
>   lower := 'escription' asParser.
>   startParser := '   endParser := '>' asParser.
>   ppStream := htmlString readStream asPetitStream.
>   descParser := ((#'any' asParser starLazy: startParser , lower)
> , (#'any' asParser starLazy: endParser)) ==> #'second'.
>   result := descParser parse: ppStream.
>   text := (result
> inject: (WriteStream on: String new)
> into: [ :stream :char | 
>   stream nextPut: char.
>   stream ])
> contents trimBoth.
>   str := text copyFrom: (text findString: 'content=') + 9 to: text size.
>   doubleQuoteIndex := 8 - ((str last: 7) indexOf: $").
>   ^ str copyFrom: 1 to: str size - doubleQuoteIndex
> 
> 
> I can't figure out how to change the startParser parser to accept the second 
> idiom.  And maybe there's a better approach altogether.  Anyway.  If anyone 
> has any ideas on different approaches I'd appreciate learning them.  
> 
> 
> Thanks for giving it some thought
> 
> Paul
>

Re: [Pharo-users] [ANN] Regex Tester Tool for Pharo

2017-03-01 Thread monty

Very nice!

> Sent: Wednesday, March 01, 2017 at 3:32 PM
> From: "Torsten Bergmann" 
> To: "Pharo Development List" , "Any question about 
> pharo is welcome" 
> Subject: [Pharo-users] [ANN] Regex Tester Tool for Pharo
>
> Hi,
> 
> I wrote a little tool to test regular expressions and verify that
> given samples match it. It also helps to divide an expression into
> subexpressions to retrieve parts of a matched string.
> 
> Screenshot is attached. Code, load instructions and full tutorial 
> on how to use it is on https://github.com/astares/Pharo-Regex-Tools
> 
> Hope it is useful for others too. Have fun!
> 
> Bye
> T.

Re: [Pharo-users] [ANN] XML Metadata Interchange (XMI) for Pharo

2017-03-01 Thread monty

Is this based on Peter's work? Also:

"anXMLElement name xmlPrefixBeforeLocalName" -> "anXMLElement prefix"
"anXMLElement elements collect:" -> "anXMLElement elementsCollect:"

"^self fromXMLElement: (XMLDOMParser parse: aStringOrStream usingNamespaces: 
false) root"...why are you disabling namespaces? If it's for performance, also 
look at #optimizeForLargeDocuments. In fact it would be good to browse the entire 
"configuring" category of XMLDOMParser and its superclass to see what you need.

> Sent: Wednesday, March 01, 2017 at 4:26 PM
> From: "Torsten Bergmann" 
> To: "Pharo Development List" , "Any question about 
> pharo is welcome" 
> Subject: [Pharo-users] [ANN] XML Metadata Interchange (XMI) for Pharo
>
> Hi,
> 
> if you work with UML, modeling tools or model data exchange often you might 
> know XMI - 
> the XML Metadata Interchange format. 
> 
> I wrote a little package that makes it easier to work and browse data based 
> on this 
> format within Pharo.
> 
> For instance you can open an XMI object either from a given stream 
> or URL:
> 
>   (XMIFile fromURL: 'http://www.omg.org/spec/UML/20131001/UML.xmi') inspect
> 
> You can also open a file
> 
>   XMIFile importFile
> 
> Code is available on https://github.com/astares/Pharo-XMI
> 
> The package includes a small GT extension allowing you to walk and dive 
> through the XMI
> structure (see attached screenshot) and as the XMI nodes are unified it is 
> easy to 
> code some code generators or transformers in Pharo afterwards.
> 
> For instance I used this package for a simple Pharo code generator based on 
> Eclipse EMF models/diagrams.
> 
> Within the code you will find some examples to browse through prominent XMI 
> models like:
> 
>  - UML Spec
>=> (XMIFile fromURL: 'http://www.omg.org/spec/UML/20131001/UML.xmi') 
> inspect
> 
>  - VISUAL PARADIGM
>=> (XMIFile fromURL: 
> 'https://raw.githubusercontent.com/staruml/XMI/master/unittest-files/VP_XMI21.xmi')
>  inspect
>  
>  - ENTERPRISE ARCHITECT
>=> (XMIFile fromURL: 
> 'https://raw.githubusercontent.com/staruml/XMI/master/unittest-files/EA_XMI21.xmi')
>  inspect
>  
>  - ARCHIMATE
>=> (XMIFile fromURL: 'https://www.reflektis.com/repos/archimate-3.0.xmi') 
> inspect
> 
> Once the inspector opens you have to select the "rootNode" and then the 
> "Hierarchy" tab.
> 
> In Pharo 6 you can load "XMI" right from catalog. Hope the package is useful 
> for others too.
> 
> Bye
> T.

Re: [Pharo-users] Coding XPath as Smalltalk

2017-02-20 Thread monty

An improved ?? construct is part of the public API now. It can be used to 
attach predicates to node tests:
xmlNode / 'foo' / ('bar' ?? 10).

or to filter result node sets like this:
(xmlNode / 'foo' / 'bar') ?? 10.
xmlNode / 'foo' / 'bar' ?? 10. "same because of precedence"

The first is equivalent to this:
'foo/bar[10]' asXPath in: xmlNode.

The others are equivalent to this:
'(foo/bar)[10]' asXPath in: xmlNode.

Block predicates take the context node, position, and size as optional (cull:) 
arguments. Multiple predicates are done by giving ?? an Array argument or 
chaining ?? sends like this:
xmlNode / 'foo' / ('bar' ?? [:each | each includesAttribute: 'name'] ?? 1).

Number predicates are faster than blocks and an initial number predicate 
attached to a node test (like in the first example) optimizes the node test by 
limiting the number of matching nodes it enumerates. 
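
The difference between 'foo/bar[10]' and '(foo/bar)[10]' is easy to miss, so here is a small sketch of it. It is written in Python with the standard-library ElementTree, not with the Pharo XPath package, and the sample document and the position 2 are made up for the illustration. ElementTree has no parenthesized form, so the second case is imitated by indexing the result list.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: two foo elements with two bar children each.
doc = ET.fromstring(
    "<root>"
    "<foo><bar>a1</bar><bar>a2</bar></foo>"
    "<foo><bar>b1</bar><bar>b2</bar></foo>"
    "</root>"
)

# Like foo/bar[2]: the predicate is attached to the node test, so it picks
# the second bar child of each foo.
per_parent = [e.text for e in doc.findall("foo/bar[2]")]

# Like (foo/bar)[2]: the predicate filters the combined result node set,
# so it picks the second bar overall.
all_bars = doc.findall("foo/bar")
overall = [all_bars[1].text]

print(per_parent, overall)
```

With a predicate attached to the node test there can be one match per parent; filtering the whole result set yields at most one node.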

> Sent: Saturday, September 03, 2016 at 11:08 AM
> From: PBKResearch <pe...@pbkresearch.co.uk>
> To: "'Any question about pharo is welcome'" <pharo-users@lists.pharo.org>
> Subject: Re: [Pharo-users] Coding XPath as Smalltalk
>
> Hi Monty
> 
> Just to say that I have obtained the Moose 6.0 image (Pharo5.0 Latest update: 
> #50761) and installed the XMLHTMLParser, and I seem able to reproduce the 
> nucleus of the results I had from my old image. Some of my old XPath strings 
> do not work (e.g. it did not recognise [1]), but I have worked my way round 
> that, and I should soon have worked out the new syntax. Thanks for your 
> suggestions for the variable case; I can now see the way ahead for that. 
> Thanks for all the help.
> 
> @stef
> 
> I have loaded the same version of TextLint as I had in the previous image. It 
> was accepted with no problems. I have tested it for the limited uses I make 
> of it (just the parsers, not the rules) and everything seems OK. So now I am 
> upgraded to Pharo 5, so far with no problems!
> 
> Peter Kenny
> 
> -Original Message-
> From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of 
> monty
> Sent: 03 September 2016 13:00
> To: pharo-users@lists.pharo.org
> Subject: Re: [Pharo-users] Coding XPath as Smalltalk
> 
> 
> 
> > Sent: Saturday, September 03, 2016 at 5:30 AM
> > From: PBKResearch <pe...@pbkresearch.co.uk>
> > To: "'Any question about pharo is welcome'" <pharo-users@lists.pharo.org>
> > Subject: Re: [Pharo-users] Coding XPath as Smalltalk
> >
> > Hi Monty
> > 
> > Many thanks. I have picked up a project that I had not worked on for a 
> > while, which explains why I am using an old image. I shall try the latest 
> > Moose image, as you suggest. My only anxiety is that I need to be able to 
> > use a rather ancient package called TextLint, and I do not know whether it 
> > will load OK in a new Pharo. If not, I shall try to update my existing 
> > image.
> 
> If you'd looked at CI job, you'd see that XPath builds on Pharo 5 through 3 
> (but should work back to 1.4). You can always start fresh with a clean, old 
> image from http://files.pharo.org/image/ or the Moose website if TextLint 
> doesn't work anymore.
> 
> > With the latest XPath, will it be clear how to use the binary syntax to 
> > carry out node tests like the example of '//div[@id=''catlinks'']//' that I 
> > cited below? The case I am interested in is where the actual identifier 
> > ('catlinks' in this case) is a variable rather than a constant. It would be 
> > possible to do it in standard XPath by assembling the XPath string with a 
> > variable component, but it might be more convenient in the binary syntax.
> > 
> 
> You could do this:
>  ((doc // 'div') select: [:each | (each attributeAt: 'id') = catlinks]) // 
> 'li' // 'text()'
> 
> where "catlinks" is a var. Or you could use xPath:context: with an XPath var 
> that you dynamically bind using custom contexts:
>  doc
>  xPath: '//div[@id=$catlinks]//li//text()'
>  context: (XPathContext variables: {'catlinks' -> catlinks})
> 
> The advantage over this:
>  doc xPath: '//div[@id=''', catlinks, ''']//li//text()'
> 
> is that the xPath: expression string is the same each time, so it's only 
> compiled once, the first time, and cached for later uses (inspect 'XPath 
> compiledXPathCache') instead of being compiled each time the xPath: 
> expression string arg changes.
> 
> > Many thanks for your help.
> > 
> > Peter Kenny
> > 
> > -Original Message-
> > From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of 
> > monty
> > Sent: 03 September 2016 06:54
>

Re: [Pharo-users] The Ultimate Smalltalk Tutorial

2016-10-24 Thread monty

+1 for PBE.


> Sent: Monday, October 24, 2016 at 1:56 AM
> From: "Nicolai Hess" 
> To: "Any question about pharo is welcome" 
> Subject: Re: [Pharo-users] The Ultimate Smalltalk Tutorial
>  
> Am 23.10.2016 3:16 nachm. schrieb "Vitor Medina Cruz" 
> :
> >
> > I think the MOOC is too much for a tutorial. What I miss today is a good 
> > written (no videos! Please!) tutorial that teaches just a little of the 
> > language and give a few guidelines on how to do simple stuff with the 
> > environment, such as a "Hello World!", creating a class, tests and run 
> > stuff. 
> I thought "pharo by example" provides exactly  that.
> What is missing here, from your perspective?
> I learned a lot from it and it helped me to get started to learn smalltalk, 
> not only the syntax, but also, doing something the smalltalk way.
> >
> > On Sat, Oct 15, 2016 at 12:15 PM, horrido 
> >  wrote:
> >>
> >> Excellent suggestion! I shall look into it. Thanks.
> >>
> >>
> >>
> >>
> >> --
> >> View this message in context: 
> >> http://forum.world.st/The-Ultimate-Smalltalk-Tutorial-tp4918859p4918930.html
> >> Sent from the Pharo Smalltalk Users mailing list archive at Nabble.com.
> >>
> >

Re: [Pharo-users] Coding XPath as Smalltalk

2016-09-03 Thread monty

Thanks!

> Sent: Saturday, September 03, 2016 at 4:31 AM
> From: "Sven Van Caekenberghe" <s...@stfx.eu>
> To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
> Subject: Re: [Pharo-users] Coding XPath as Smalltalk
>
> 
> > On 03 Sep 2016, at 08:17, Tudor Girba <tu...@tudorgirba.com> wrote:
> > 
> > Hi,
> > 
> > Indeed, Monty is doing a great job at maintaining and evolving the XML 
> > support.
> 
> Yes indeed !
> 
> > Cheers,
> > Doru
> > 
> > 
> >> On Sep 3, 2016, at 8:06 AM, Hernán Morales Durand 
> >> <hernan.mora...@gmail.com> wrote:
> >> 
> >> Thank you Monty for the clarification. I should say the original XPath 
> >> package was written by Phil Hargett and I just added a couple of methods. 
> >> Glad you rewrote the lib!
> >> Cheers,
> >> 
> >> Hernán
> >> 
> >> 
> >> 2016-09-03 3:01 GMT-03:00 monty <mon...@programmer.net>:
> >> 
> >> Hernan, the PharoExtras/XPath repo has a major rewrite of your package to 
> >> support all of XPath 1.0 + XPath 2.0 extensions like the element() and 
> >> attribute() type tests and namespace literals in name tests like 
> >> '{namespaceURI}localName'. A rewrite was needed because the old lib only 
> >> implemented a small subset of the spec and would infinite loop on some 
> >> inputs.
> >> 
> >> Sent: Thursday, September 01, 2016 at 3:56 PM
> >> From: "Hernán Morales Durand" <hernan.mora...@gmail.com>
> >> 
> >> To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
> >> Subject: Re: [Pharo-users] Coding XPath as Smalltalk
> >> 
> >> 
> >> 2016-09-01 16:51 GMT-03:00 PBKResearch <pe...@pbkresearch.co.uk>:
> >> Hi Hernan
> >> 
> >> 
> >> I don’t understand your first question – I can’t see a connection between 
> >> SPARQL and what I am doing.
> >> 
> >> 
> >> 
> >> You could get the Wikitionary data by querying a SPARQL endpoint 
> >> http://wiktionary.dbpedia.org/sparql instead of scrapping web pages (which 
> >> seems more difficult)
> >> 
> >> 
> >> I downloaded XPath from http://smalltalkhub.com/mc/PharoExtras/XPath/. 
> >> However, I am probably using a somewhat out of date version; I downloaded 
> >> it about a year ago.
> >> 
> >> 
> >> 
> >> I don't know about that version. I copied an old version from SqueakSource 
> >> (with permission) and updated from time to time, but there is no much. 
> >> There is also a XPath2 repository which you may try.
> >> 
> >> Hernán
> >> 
> >> 
> >> Peter
> >> 
> >> 
> >> From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf 
> >> Of Hernán Morales Durand
> >> Sent: 01 September 2016 18:54
> >> To: Any question about pharo is welcome <pharo-users@lists.pharo.org>
> >> Subject: Re: [Pharo-users] Coding XPath as Smalltalk
> >> 
> >> 
> >> Hi Peter,
> >> 
> >> 
> >> 2016-09-01 10:26 GMT-03:00 PBKResearch <pe...@pbkresearch.co.uk>:
> >> 
> >> Hello
> >> 
> >> 
> >> I am using XPath as a way of dissecting web pages, especially from 
> >> Wiktionary.
> >> 
> >> 
> >> Any specific reason to not use the SPARQL endpoint?
> >> 
> >> 
> >> 
> >> 
> >> Generally I get good results, but I could get useful extra flexibility by 
> >> using the binary Smalltalk operators to represent XPath, as mentioned at 
> >> the end of the class comment for XPath. However, the description there is 
> >> very terse, and I am having difficulty seeing how to include more complex 
> >> expressions, especially attribute tests.
> >> 
> >> 
> >> Which XPath version are you using? How did you installed it?
> >> 
> >> 
> >> 
> >> 
> >> I have put some of my XPath expressions through the XPath compiler and 
> >> looked at the output, and out of that I have found expressions which work 
> >> but look very clumsy. As an example, I have used the fragment:
> >> 
> >> 
> >> document xPath: '//div[@id=''catlinks'']//li//text()'
> >> 
> >> 
> >> and found that an equivalent is:
> >> 
> >> 
> >> document //'div' ?? [:node :x :y|(node attributeAt: 'id') = 
> >> 'catlinks']//'li'//[:n| n isStringNode]].
> >> 
> >> (I had to put two dummy arguments in the three-argument block to get it to 
> >> work.)
> >> 
> >> 
> >> Is there a more extensive explanation of the use of these binary 
> >> operators? If not, could some kind person show me the most concise 
> >> translation of the sample XPath above, to give me a start in working out 
> >> more complex cases?
> >> 
> >> 
> >> Many thanks for any help.
> >> 
> >> 
> >> Peter Kenny
> >> 
> >> 
> >> 
> >> 
> > 
> > --
> > www.tudorgirba.com
> > www.feenk.com 
> > 
> > “Live like you mean it."
> > 
> > 
> 
> 
>

Re: [Pharo-users] Coding XPath as Smalltalk

2016-09-03 Thread monty



> Sent: Saturday, September 03, 2016 at 5:30 AM
> From: PBKResearch <pe...@pbkresearch.co.uk>
> To: "'Any question about pharo is welcome'" <pharo-users@lists.pharo.org>
> Subject: Re: [Pharo-users] Coding XPath as Smalltalk
>
> Hi Monty
> 
> Many thanks. I have picked up a project that I had not worked on for a while, 
> which explains why I am using an old image. I shall try the latest Moose 
> image, as you suggest. My only anxiety is that I need to be able to use a 
> rather ancient package called TextLint, and I do not know whether it will 
> load OK in a new Pharo. If not, I shall try to update my existing image.

If you'd looked at the CI job, you'd see that XPath builds on Pharo 5 through 3 
(but should work back to 1.4). You can always start fresh with a clean, old 
image from http://files.pharo.org/image/ or the Moose website if TextLint 
doesn't work anymore.

> With the latest XPath, will it be clear how to use the binary syntax to carry 
> out node tests like the example of '//div[@id=''catlinks'']//' that I cited 
> below? The case I am interested in is where the actual identifier ('catlinks' 
> in this case) is a variable rather than a constant. It would be possible to 
> do it in standard XPath by assembling the XPath string with a variable 
> component, but it might be more convenient in the binary syntax.
> 

You could do this:
 ((doc // 'div') select: [:each | (each attributeAt: 'id') = catlinks]) // 'li' 
// 'text()'

where "catlinks" is a var. Or you could use xPath:context: with an XPath var 
that you dynamically bind using custom contexts:
 doc
 xPath: '//div[@id=$catlinks]//li//text()'
 context: (XPathContext variables: {'catlinks' -> catlinks})

The advantage over this:
 doc xPath: '//div[@id=''', catlinks, ''']//li//text()'

is that the xPath: expression string is the same each time, so it's only 
compiled once, the first time, and cached for later uses (inspect 'XPath 
compiledXPathCache') instead of being compiled each time the xPath: expression 
string arg changes.
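
The caching argument can be sketched outside Pharo too. The Python below is only a toy model: compiled(), cache and compile_count are hypothetical names, and the "compilation" is a placeholder, not what the XPath package does internally. It just counts how often a distinct expression string has to be compiled under each approach.

```python
compile_count = 0
cache = {}

def compiled(expression):
    """Return the cached compiled form, compiling only on a cache miss."""
    global compile_count
    if expression not in cache:
        compile_count += 1
        # Placeholder for a real (expensive) compilation step.
        cache[expression] = tuple(expression.split("//"))
    return cache[expression]

ids = ["catlinks", "footer", "content"]

# String concatenation: every id yields a new expression string,
# so every call is a cache miss.
for id_value in ids:
    compiled("//div[@id='" + id_value + "']//li//text()")
concatenated_compiles = compile_count

# Variable binding: the expression string never changes, only the binding.
compile_count = 0
cache.clear()
for id_value in ids:
    expression = compiled("//div[@id=$catlinks]//li//text()")
    bindings = {"catlinks": id_value}  # would be passed along at evaluation time
variable_compiles = compile_count

print(concatenated_compiles, variable_compiles)
```

Three different ids cost three compilations with concatenation and one with a bound variable.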

> Many thanks for your help.
> 
> Peter Kenny
> 
> -Original Message-
> From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of 
> monty
> Sent: 03 September 2016 06:54
> To: pharo-users@lists.pharo.org
> Subject: Re: [Pharo-users] Coding XPath as Smalltalk
> 
> Peter, you're using an ancient version with bugs that were fixed last fall. 
> The newest version has more tests and correct behavior (checked against a 
> reference implementation). Just download a new Moose image and you'll get it, 
> along with an up to date XMLParser. (But if you insist on upgrading in your 
> old image, run "XPath initialize" after)
> 
> The binary syntax (there are keyword equivalents now) officially only 
> supports XPath axis selectors like #/ and #// that take node test arguments 
> where the node tests can be name tests like 'name,' '*', 'prefix:*' or type 
> tests like 'text()', 'comment()', 'element(name)'. 
> 
> Filters aren't officially supported with that syntax, but you can always use 
> select: on the result. ?? was removed, but I might add it back as shorthand. 
> Filters are implemented differently now.
> 
> > From: PBKResearch <pe...@pbkresearch.co.uk>
> > To: pharo-users@lists.pharo.org
> > Subject: [Pharo-users] Coding XPath as Smalltalk
> > 
> > Hello
> >  
> > I am using XPath as a way of dissecting web pages, especially from 
> > Wiktionary. Generally I get good results, but I could get useful extra 
> > flexibility by using the binary Smalltalk operators to represent XPath, as 
> > mentioned at the end of the class comment for XPath. However, the 
> > description there is very terse, and I am having difficulty seeing how to 
> > include more complex expressions, especially attribute tests. I have put 
> > some of my XPath expressions through the XPath compiler and looked at the 
> > output, and out of that I have found expressions which work but look very 
> > clumsy. As an example, I have used the fragment:
> >  
> > document xPath: '//div[@id=''catlinks'']//li//text()'
> >  
> > and found that an equivalent is:
> >  
> > document //'div' ?? [:node :x :y|(node attributeAt: 'id') = 
> > 'catlinks']//'li'//[:n| n isStringNode]].
> > (I had to put two dummy arguments in the three-argument block to get it to 
> > work.)
> >  
> > Is there a more extensive explanation of the use of these binary operators? 
> > If not, could some kind person show me the most concise translation of the 
> > sample XPath above, to give me a start in working out more complex cases?
> >  
> > Many thanks for any help.
> >  
> > Peter Kenny
> 
> 
>

Re: [Pharo-users] Coding XPath as Smalltalk

2016-09-03 Thread monty

> Sent: Saturday, September 03, 2016 at 2:02 AM
> From: stepharo <steph...@free.fr>
> To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
> Subject: Re: [Pharo-users] Coding XPath as Smalltalk
>
> Hi monty
> 
> In which repository this maintained version is?

PharoExtras/XPath (you gave me the write access).

 
> PharoExtras?
> 
> Is it the entry in the catalog?

It has a catalog entry at http://catalog.pharo.org and a CI job at 
https://ci.inria.fr/pharo-contribution/job/XPath/

> 
> Stef
> 
> 
> 
> Le 3/9/16 à 07:54, monty a écrit :
> > Peter, you're using an ancient version with bugs that were fixed last fall. 
> > The newest version has more tests and correct behavior (checked against a 
> > reference implementation). Just download a new Moose image and you'll get 
> > it, along with an up to date XMLParser. (But if you insist on upgrading in 
> > your old image, run "XPath initialize" after)
> >
> > The binary syntax (there are keyword equivalents now) officially only 
> > supports XPath axis selectors like #/ and #// that take node test arguments 
> > where the node tests can be name tests like 'name,' '*', 'prefix:*' or type 
> > tests like 'text()', 'comment()', 'element(name)'.
> >
> > Filters aren't officially supported with that syntax, but you can always 
> > use select: on the result. ?? was removed, but I might add it back as 
> > shorthand. Filters are implemented differently now.
> >
> >> From: PBKResearch <pe...@pbkresearch.co.uk>
> >> To: pharo-users@lists.pharo.org
> >> Subject: [Pharo-users] Coding XPath as Smalltalk
> >>
> >> Hello
> >>   
> >> I am using XPath as a way of dissecting web pages, especially from 
> >> Wiktionary. Generally I get good results, but I could get useful extra 
> >> flexibility by using the binary Smalltalk operators to represent XPath, as 
> >> mentioned at the end of the class comment for XPath. However, the 
> >> description there is very terse, and I am having difficulty seeing how to 
> >> include more complex expressions, especially attribute tests. I have put 
> >> some of my XPath expressions through the XPath compiler and looked at the 
> >> output, and out of that I have found expressions which work but look very 
> >> clumsy. As an example, I have used the fragment:
> >>   
> >> document xPath: '//div[@id=''catlinks'']//li//text()'
> >>   
> >> and found that an equivalent is:
> >>   
> >> document //'div' ?? [:node :x :y|(node attributeAt: 'id') = 
> >> 'catlinks']//'li'//[:n| n isStringNode]].
> >> (I had to put two dummy arguments in the three-argument block to get it to 
> >> work.)
> >>   
> >> Is there a more extensive explanation of the use of these binary 
> >> operators? If not, could some kind person show me the most concise 
> >> translation of the sample XPath above, to give me a start in working out 
> >> more complex cases?
> >>   
> >> Many thanks for any help.
> >>   
> >> Peter Kenny
> >
> 
> 
>

Re: [Pharo-users] Coding XPath as Smalltalk

2016-09-03 Thread monty

Hernan, the PharoExtras/XPath repo has a major rewrite of your package to support all of XPath 1.0 + XPath 2.0 extensions like the element() and attribute() type tests and namespace literals in name tests like '{namespaceURI}localName'. A rewrite was needed because the old lib only implemented a small subset of the spec and would infinite loop on some inputs.

Sent: Thursday, September 01, 2016 at 3:56 PM
From: "Hernán Morales Durand" 
To: "Any question about pharo is welcome" 
Subject: Re: [Pharo-users] Coding XPath as Smalltalk

2016-09-01 16:51 GMT-03:00 PBKResearch :

Hi Hernan

I don’t understand your first question – I can’t see a connection between SPARQL and what I am doing.

You could get the Wikitionary data by querying a SPARQL endpoint http://wiktionary.dbpedia.org/sparql instead of scrapping web pages (which seems more difficult)

I downloaded XPath from http://smalltalkhub.com/mc/PharoExtras/XPath/. However, I am probably using a somewhat out of date version; I downloaded it about a year ago.

I don't know about that version. I copied an old version from SqueakSource (with permission) and updated from time to time, but there is no much. There is also a XPath2 repository which you may try.

Hernán

Peter

From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of Hernán Morales Durand
Sent: 01 September 2016 18:54
To: Any question about pharo is welcome 
Subject: Re: [Pharo-users] Coding XPath as Smalltalk

Hi Peter,

2016-09-01 10:26 GMT-03:00 PBKResearch :

Hello

I am using XPath as a way of dissecting web pages, especially from Wiktionary.

Any specific reason to not use the SPARQL endpoint?

Generally I get good results, but I could get useful extra flexibility by using the binary Smalltalk operators to represent XPath, as mentioned at the end of the class comment for XPath. However, the description there is very terse, and I am having difficulty seeing how to include more complex expressions, especially attribute tests.

Which XPath version are you using? How did you installed it?

I have put some of my XPath expressions through the XPath compiler and looked at the output, and out of that I have found expressions which work but look very clumsy. As an example, I have used the fragment:

document xPath: '//div[@id=''catlinks'']//li//text()'

and found that an equivalent is:

document //'div' ?? [:node :x :y|(node attributeAt: 'id') = 'catlinks']//'li'//[:n| n isStringNode]].

(I had to put two dummy arguments in the three-argument block to get it to work.)

Is there a more extensive explanation of the use of these binary operators? If not, could some kind person show me the most concise translation of the sample XPath above, to give me a start in working out more complex cases?

Many thanks for any help.

Peter Kenny

Re: [Pharo-users] Coding XPath as Smalltalk

2016-09-02 Thread monty

Peter, you're using an ancient version with bugs that were fixed last fall. The 
newest version has more tests and correct behavior (checked against a reference 
implementation). Just download a new Moose image and you'll get it, along with 
an up-to-date XMLParser. (But if you insist on upgrading in your old image, run 
"XPath initialize" afterwards.)

The binary syntax (there are keyword equivalents now) officially only supports 
XPath axis selectors like #/ and #// that take node test arguments, where the 
node tests can be name tests like 'name', '*', 'prefix:*' or type tests like 
'text()', 'comment()', 'element(name)'. 

Filters aren't officially supported with that syntax, but you can always use 
select: on the result. ?? was removed, but I might add it back as shorthand. 
Filters are implemented differently now.
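
For readers who want to see the "select on the result" shape in isolation, here is a rough analogue in Python's standard-library ElementTree (not the Pharo API). The sample HTML and the wanted_id variable are invented for the illustration: the axis step collects candidate nodes, an ordinary filter plays the role of select:, and a further axis step runs on what is left.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page.
doc = ET.fromstring(
    "<html><body>"
    "<div id='catlinks'><ul><li>Nouns</li><li>Verbs</li></ul></div>"
    "<div id='other'><ul><li>Skip</li></ul></div>"
    "</body></html>"
)
wanted_id = "catlinks"  # a variable, not a constant baked into a path string

# Axis step: all descendant div elements (compare doc // 'div').
divs = doc.findall(".//div")

# Filter step: an ordinary predicate over the result (compare select:).
matching = [div for div in divs if div.get("id") == wanted_id]

# Next axis step on the filtered nodes (compare // 'li').
texts = [li.text for div in matching for li in div.findall(".//li")]

print(texts)
```

Because the filter is plain code, the id can come from any variable without building a path string.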

> From: PBKResearch 
> To: pharo-users@lists.pharo.org
> Subject: [Pharo-users] Coding XPath as Smalltalk
> 
> Hello
>  
> I am using XPath as a way of dissecting web pages, especially from 
> Wiktionary. Generally I get good results, but I could get useful extra 
> flexibility by using the binary Smalltalk operators to represent XPath, as 
> mentioned at the end of the class comment for XPath. However, the description 
> there is very terse, and I am having difficulty seeing how to include more 
> complex expressions, especially attribute tests. I have put some of my XPath 
> expressions through the XPath compiler and looked at the output, and out of 
> that I have found expressions which work but look very clumsy. As an example, 
> I have used the fragment:
>  
> document xPath: '//div[@id=''catlinks'']//li//text()'
>  
> and found that an equivalent is:
>  
> document //'div' ?? [:node :x :y|(node attributeAt: 'id') = 
> 'catlinks']//'li'//[:n| n isStringNode]].
> (I had to put two dummy arguments in the three-argument block to get it to 
> work.)
>  
> Is there a more extensive explanation of the use of these binary operators? 
> If not, could some kind person show me the most concise translation of the 
> sample XPath above, to give me a start in working out more complex cases?
>  
> Many thanks for any help.
>  
> Peter Kenny

Re: [Pharo-users] Set on Attribute

2016-08-25 Thread monty

Other than PluggableSet, something like Unix's sort and uniq combo would work:

condenseOnFirst: aCollection
    | lastSelected |

    ^ aCollection sorted select: [:each |
        (lastSelected isNil
            or: [lastSelected first ~= each first])
                ifTrue: [
                    lastSelected := each.
                    true]
                ifFalse: [false]]
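
For comparison, the same sort-then-uniq idea in Python, as a sketch only. condense_on_first is a hypothetical name, and the key (the first character) is taken from the question; itertools.groupby only merges adjacent items, which is why the input is sorted first.

```python
from itertools import groupby

def condense_on_first(items):
    """Keep one item per distinct first character (the first in sort order)."""
    ordered = sorted(items)
    # groupby groups adjacent items with equal keys, like uniq after sort.
    return [next(group) for _, group in groupby(ordered, key=lambda s: s[0])]

result = condense_on_first(["ab", "ac", "ba", "bc"])
print(result)
```

Which item survives per key depends on the sort order, which matches the "I don't care which object" requirement.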



> Sent: Thursday, August 25, 2016 at 7:42 PM
> From: "Sean P. DeNigris" 
> To: pharo-users@lists.pharo.org
> Subject: [Pharo-users] Set on Attribute
>
> Say I have a collection like #('ab' 'ac' 'ba' 'bc') and I want to condense it
> so that a certain attribute is unique. In this example, say the first
> character, so I want one object where the first character is $a and one $b,
> but I don't care which object i.e. 'ab' or 'ac' for $a, but not both.
> 
> Is there an elegant way to do that?
> 
> 
> 
> -
> Cheers,
> Sean
> --
> View this message in context: 
> http://forum.world.st/Set-on-Attribute-tp4912672.html
> Sent from the Pharo Smalltalk Users mailing list archive at Nabble.com.
> 
>

Re: [Pharo-users] Generating custom classes based on attributes from XML Document

2016-08-17 Thread monty

You can now match on attributes too. The attributes: argument can be any 
attribute specification object or just an ordinary dict or other association 
collection (a nil value means the key must be present, with any value).
 
And there is visitor pattern support so you don't have to define polymorphic 
extension methods on every node class or use a bunch of isWhatever testing 
messages when tree traversing.
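
To make the matching rule concrete, here is a language-neutral sketch in Python. It is not the XMLPluggableElementFactory API: ElementClassRegistry, handle and class_for are hypothetical names, and the classes are empty stand-ins. It only models "match on element name plus required attributes, where a None value means the attribute must be present with any value".

```python
class GenericElement:
    pass

class UmlClass(GenericElement):
    pass

class UmlAssociation(GenericElement):
    pass

class ElementClassRegistry:
    """Hypothetical registry: picks a class from a name and required attributes."""

    def __init__(self, default):
        self.default = default
        self.rules = []

    def handle(self, name, attributes, element_class):
        self.rules.append((name, attributes, element_class))

    def class_for(self, name, attributes):
        for rule_name, required, element_class in self.rules:
            if rule_name == name and all(
                key in attributes and (value is None or attributes[key] == value)
                for key, value in required.items()
            ):
                return element_class
        return self.default

registry = ElementClassRegistry(GenericElement)
registry.handle("packagedElement", {"xmi:type": "uml:Class"}, UmlClass)
registry.handle("packagedElement", {"xmi:type": "uml:Association"}, UmlAssociation)

chosen = registry.class_for("packagedElement", {"xmi:type": "uml:Class", "name": "A"})
print(chosen.__name__)
```

Rules are tried in registration order, so more specific rules should be registered before more general ones.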

> Sent: Friday, March 18, 2016 at 10:17 AM
> From: "Peter Uhnák" 
> To: "Pharo Users List" 
> Subject: [Pharo-users] Generating custom classes based on attributes from XML 
> Document
> 
> I have a XML like this 
>  
> 
>  isAbstract="true">
> 
> 
> 
> 
> 
>  
> and I would like to generate UmlClass and UmlAssociation classes for this.
>  
> I could use XMLPluggableElementFactory, however that only allows me to 
> specify the target class only on the element's name, such as
>  
> 
> doc := (XMLDOMParser on: someXML)
> nodeFactory:
> (XMLPluggableElementFactory new
> elementClass: GenericElement;
> handleElement: 'packagedElement' withClass: UmlPackagedElement)
> parseDocument.
>  
> However I need better granularity, because I would like to have a different 
> class for 'packagedElement[xmi:type="uml:Class"]' and different class for 
> 'packagedElement[xmi:type="uml:Association"]'.
>  
> Of course I could use double visitors (so use the Pluggable stuff to get 
> UmlPackagedElement) and then make another pass on it to get the final classes,
>  
> however I would prefer to have to wanted class directly without the need to 
> go through intermediate representation.
>  
> Is there a way to do this?
>  
> Thanks,
> Peter
>

Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

2016-07-28 Thread monty

Also, #parseURL:/#onURL: will use WebClient on Squeak (unless Zinc is present, 
of course).

> Sent: Thursday, July 28, 2016 at 6:15 PM
> From: monty <mon...@programmer.net>
> To: pharo-users@lists.pharo.org
> Subject: Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”
>
> Good for finding one of the fixes, but please use #parseURL:/#onURL: instead 
> of #asUrl/#asZnUrl with #retrieveContents, because that will result in Zinc 
> eagerly decoding the response without looking at the XML declaration as 
> the XML spec requires.
> 
> #parseURL:/#onURL: use Zinc correctly, doing their own XML-aware encoding on 
> top of it.
> 
> > Sent: Thursday, July 28, 2016 at 5:29 PM
> > From: "Sven Van Caekenberghe" <s...@stfx.eu>
> > To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
> > Subject: Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”
> >
> > In my older work image, the following just works:
> > 
> > XMLDOMParser parse:
> > ('http://forum.world.st/file/n4908531/illegal-UTF-sms.xml' asUrl 
> > retrieveContents).
> > 
> > But I guess that is because my (older) XML parser version ignores the 
> > encoding, or is more lenient.
> > 
> > You could try to edit the incoming file, or have a look at 
> > #decodesCharacters: 
> > 
> > (XMLDOMParser on:
> > ('http://forum.world.st/file/n4908531/illegal-UTF-sms.xml' asUrl 
> > retrieveContents) readStream) decodesCharacters: false; parseDocument.
> > 
> > But I am no expert in the deeper aspects of XML Support.
> > 
> > > On 28 Jul 2016, at 22:29, Sean P. DeNigris <s...@clipperadams.com> wrote:
> > > 
> > > Sven Van Caekenberghe-2 wrote
> > >> Your XML file is not UTF-8 encoded, it is plain Unicode. At least the way
> > >> it is served from the URL you gave.
> > >> ..
> > >> You see ?
> > > 
> > > Unfortunately, no! ha ha. I didn't generate the file and I took it's
> > > assertion that it was UTF-8 at face value. How do I properly feed the file
> > > into XMLParser?
> > > 
> > > 
> > > 
> > > -
> > > Cheers,
> > > Sean
> > > --
> > > View this message in context: 
> > > http://forum.world.st/XMLParser-Claims-U-00A0-is-Invalid-UTF-8-tp4908525p4908539.html
> > > Sent from the Pharo Smalltalk Users mailing list archive at Nabble.com.
> > > 
> > 
> > 
> >
> 
>

Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

2016-07-28 Thread monty

Good for finding one of the fixes, but please use #parseURL:/#onURL: instead of 
#asUrl/#asZnUrl with #retrieveContents, because that will result in Zinc 
eagerly decoding the response without looking at the XML declaration as 
the XML spec requires.

#parseURL:/#onURL: use Zinc correctly, doing their own XML-aware encoding 
handling on top of it.
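
Why the declaration matters can be shown with a small Python sketch using the standard-library ElementTree (an illustration of the principle, not of Zinc or XMLParser). The ISO-8859-1 document below is invented: a parser given the raw bytes reads the declaration and decodes correctly, while decoding the bytes up front with an assumed encoding fails.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body: declared as ISO-8859-1, with byte 0xA0 (U+00A0).
raw = b'<?xml version="1.0" encoding="ISO-8859-1"?><sms body="Hi.\xa0"/>'

# Handing the parser the raw bytes lets it honor the encoding declaration.
body = ET.fromstring(raw).get("body")

# Eagerly decoding as UTF-8 before parsing ignores the declaration and fails,
# because a lone 0xA0 byte is not valid UTF-8.
try:
    raw.decode("utf-8")
    eager_decode_failed = False
except UnicodeDecodeError:
    eager_decode_failed = True

print(repr(body), eager_decode_failed)
```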

> Sent: Thursday, July 28, 2016 at 5:29 PM
> From: "Sven Van Caekenberghe" 
> To: "Any question about pharo is welcome" 
> Subject: Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”
>
> In my older work image, the following just works:
> 
> XMLDOMParser parse:
> ('http://forum.world.st/file/n4908531/illegal-UTF-sms.xml' asUrl 
> retrieveContents).
> 
> But I guess that is because my (older) XML parser version ignores the 
> encoding, or is more lenient.
> 
> You could try to edit the incoming file, or have a look at 
> #decodesCharacters: 
> 
> (XMLDOMParser on:
> ('http://forum.world.st/file/n4908531/illegal-UTF-sms.xml' asUrl 
> retrieveContents) readStream) decodesCharacters: false; parseDocument.
> 
> But I am no expert in the deeper aspects of XML Support.
> 
> > On 28 Jul 2016, at 22:29, Sean P. DeNigris  wrote:
> > 
> > Sven Van Caekenberghe-2 wrote
> >> Your XML file is not UTF-8 encoded, it is plain Unicode. At least the way
> >> it is served from the URL you gave.
> >> ..
> >> You see ?
> > 
> > Unfortunately, no! ha ha. I didn't generate the file and I took it's
> > assertion that it was UTF-8 at face value. How do I properly feed the file
> > into XMLParser?
> > 
> > 
> > 
> > -
> > Cheers,
> > Sean
> > --
> > View this message in context: 
> > http://forum.world.st/XMLParser-Claims-U-00A0-is-Invalid-UTF-8-tp4908525p4908539.html
> > Sent from the Pharo Smalltalk Users mailing list archive at Nabble.com.
> > 
> 
> 
>

Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

2016-07-28 Thread monty

You're double decoding. Use onFileNamed:/parseFileNamed: instead (and the DOM 
printToFileNamed: family of messages when writing) and let XMLParser take care 
of this for you, or disable XMLParser decoding before parsing with 
#decodesCharacters:.

Longer explanation:

The class-side #on:/#parse: messages take either a string or a stream (read the 
definitions). You gave it a FileReference, but because the argument is tested 
with isString and sent #readStream otherwise, it didn't blow up then.

File refs sent #readStream return file streams that do automatic decoding. But 
XMLParser automatically attempts its own decoding too, if:

 - The input starts with a BOM, or its encoding can be inferred from null bytes 
   before or after the first non-null byte.
 - There is an encoding declaration with a non-UTF-8 encoding.
 - There is a UTF-8 encoding declaration but the stream is not a normal 
   ReadStream (your case).

So it gets decoded twice, and the decoded value of the char causes the error. 
I'll consider changing the heuristic to make it less eager to decode.
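
The failure mode can be reproduced in a few lines of Python. This is a sketch of double decoding in general, assuming the second decoder treats each already-decoded character as if it were a raw byte; it is not a trace of what XMLParser does internally.

```python
# U+00A0 is encoded in UTF-8 as the two bytes 0xC2 0xA0.
utf8_bytes = "\u00a0".encode("utf-8")

# First decoding (the file stream): two bytes become one character, U+00A0.
decoded_once = utf8_bytes.decode("utf-8")

# A second decoder that sees that character as a byte gets a lone 0xA0 ...
as_bytes_again = decoded_once.encode("latin-1")

# ... which is a continuation byte with no lead byte, so it is invalid UTF-8.
try:
    as_bytes_again.decode("utf-8")
    second_decode_failed = False
except UnicodeDecodeError:
    second_decode_failed = True

print(utf8_bytes, as_bytes_again, second_decode_failed)
```

The bytes in the file are valid UTF-8; only the second pass over already-decoded text produces the "Invalid UTF-8" complaint.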

> Sent: Thursday, July 28, 2016 at 4:05 PM
> From: "Sean P. DeNigris" <s...@clipperadams.com>
> To: pharo-users@lists.pharo.org
> Subject: Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”
>
> monty-3 wrote
> > Just to be sure, I manually recreated your file (with the great Bless hex
> > editor) and parsed it with no issue.
> 
> Thanks!
> 
> 
> monty-3 wrote
> > Please post your code and attach the actual source as a file separately.
> 
> The code is merely:
>   messageLog := FileLocator home / 'illegal-UTF-sms.xml'. 
>   doc := XMLDOMParser parse: messageLog.
> 
> File:  illegal-UTF-sms.xml
> <http://forum.world.st/file/n4908531/illegal-UTF-sms.xml>  
> 
> 
> 
> -
> Cheers,
> Sean
> --
> View this message in context: 
> http://forum.world.st/XMLParser-Claims-U-00A0-is-Invalid-UTF-8-tp4908525p4908531.html
> Sent from the Pharo Smalltalk Users mailing list archive at Nabble.com.
> 
>

Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

2016-07-28 Thread monty

Just to be sure, I manually recreated your file (with the great Bless hex 
editor) and parsed it with no issue.

Please post your code and attach the actual source as a file separately.

> Sent: Thursday, July 28, 2016 at 3:12 PM
> From: "Sean P. DeNigris" 
> To: pharo-users@lists.pharo.org
> Subject: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”
>
> Posted to StackOverflow
> (https://stackoverflow.com/questions/38645553/xmlparser-in-pharo-claims-u00a0-is-invalid-utf-8):
> 
> 
> 
> Given the input:
> 
> 
> 
> 
> Where the character after the "." in the body attribute of the sms tag is
> U+00A0;
> 
> I get the error:
> 
> XMLEncodingException: Invalid UTF-8 character encoding (line 2) (column
> 13)
> 
> IIUC, the UTF-8 representation of that character is 0xC2 0xA0 per Wikipedia.
> Sure enough, bytes 72 and 73 of the input are 194 and 160 respectively.
> 
> This seems like a bug in XMLParser, or am I missing something?
> 
> 
> 
> 
> -
> Cheers,
> Sean
> --
> View this message in context: 
> http://forum.world.st/XMLParser-Claims-U-00A0-is-Invalid-UTF-8-tp4908525.html
> Sent from the Pharo Smalltalk Users mailing list archive at Nabble.com.
> 
>

[Pharo-users] Some info (was: How to access XML tag name?)

2016-03-14 Thread monty

To handle character data (the "b" in "b"), use characters:. But know 
it's sent multiple times for an element if its character data is separated by 
other markup like 'bd' (the new comment explains this). 

SAX parsing usually ends up needing a stack to track elements and their 
character data and is tedious but more efficient than DOM.
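
The stack idea can be sketched with Python's standard xml.sax module instead 
of the Pharo SAXHandler (so the method names below are Python's, not 
XMLParser's). The sample document and the TextCollector class are made up for 
illustration.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Collects the full character data of each element using a stack."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one list of text chunks per currently open element
        self.texts = []   # (element name, joined text), in closing order
        self.calls = 0    # how many times characters() was invoked

    def startElement(self, name, attrs):
        self.stack.append([])

    def characters(self, content):
        # May be called several times for one element, e.g. when its
        # text is split by child markup.
        self.calls += 1
        if self.stack:
            self.stack[-1].append(content)

    def endElement(self, name):
        self.texts.append((name, "".join(self.stack.pop())))


handler = TextCollector()
# Hypothetical input: the text of <a> is split in two by the empty <c/>.
xml.sax.parseString(b"<a>b<c/>d</a>", handler)
```

After parsing, handler.texts holds the joined text per element, even though 
the chunks for the outer element arrived in separate calls.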

You can't use XPath without DOM, but there's XMLParserStAX, a newer alternative 
to XMLPullParser that supports building partial DOM trees. Like XMLPullParser, 
it treats the document as a stream of event (StAX) objects you can "pull" from 
using a stream protocol (next, peek, atEnd, and others), but it also has 
messages like nextNode to construct a DOM tree out of the next event(s). You 
can use XPath on that.

Re: [Pharo-users] Writing a dictionary as XML?

2016-02-18 Thread monty


 

xml :=
    XMLWriter writeWith: [:writer |
        dict keysAndValuesDo: [:key :value |
            writer
                tag: key
                with: value]]
 

Sent: Wednesday, February 17, 2016 at 12:46 PM
From: "p...@highoctane.be" 
To: "Any question about pharo is welcome" 
Subject: [Pharo-users] Writing a dictionary as XML?

How can I write a dictionary (OrderPreservingDictionary) as an XML string?
 
XMLWriter is more like a step by step thing.
 
TIA
Phil

Re: [Pharo-users] XML verification using an external XSD file - possible yet in Pharo?

2016-02-04 Thread monty



> Sent: Thursday, February 04, 2016 at 4:31 AM
> From: stepharo <steph...@free.fr>
> To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
> Subject: Re: [Pharo-users] XML verification using an external XSD file - 
> possible yet in Pharo?
>
> Hi paul
> 
> monty told me that his XML parser is validating.

It is but only against DTDs at the moment, not XSDs or RELAX NG.

> Stef
> 
> Le 2/2/16 23:53, PAUL DEBRUICKER a écrit :
> > Hi -
> >
> > Is there a way to validate XML files with an external XML schema file (XSD 
> > file) in Pharo ?
> >
> > Something like xerces (http://xerces.apache.org/) or xmllint 
> > (http://xmlsoft.org/xmllint.html) provide?
> >
> >
> > Thanks
> >
> > Paul
> >
> 
> 
>

Re: [Pharo-users] String vs Symbol use cases

2015-09-17 Thread monty

Peter, the XPath lib was rewritten not long ago to provide full XPath 1.0 
support + extensions, so feel free to mail me with questions or bugs. The old 
lib didn't implement the whole spec and would crash or loop infinitely on valid 
input, so I felt a rewrite was needed.

> myDocument xPath: #'entity/@name'

I think treating Symbols as Strings is a bad idea. On GS, Symbols aren't 
Strings, and #foo = 'foo' is false, so in the XPath lib itself (which now 
supports GS), I NEVER use Symbols as Strings. But on Squeak/Pharo, this won't 
hurt you. If you don't need porting, do what you like.

> myDocument / 'entity' @ 'name'

This is the "DSL" syntax.  There are binary messages for each XPath axis: // 
for "descendant", //~ for "descendant-or-self", ~ for "self" and more in the 
"enumerating - axis" category. The XPath compiler actually generates sends of 
these but with block arguments that don't need parsing (string arguments are 
treated as NameTests and can have wildcard or type tests like '*', '*:foo', or 
'text()'). Because the string args are parsed every time, using xPath: can be 
faster if you save the compiled XPath (like in an inst/class var: "savedXPath 
:= 'some/path' asXPath") and reuse it (with "aNode xPath: savedXPath"). There's 
also a global compiled XPath cache that's checked before compiling an 
expression, so xPath: can still be faster even if you don't bother saving.

Remember the xPath: usage gives access to full XPath syntax (not just axis and 
nametests), including predicates, functions, and variables. XPath is really a 
different language so mapping it all to a ST DSL is tricky. For example, XPath 
1.0 is weakly typed so "1" = 1 = "1.0" but clearly this is false in ST. Be 
aware of the differences when you go from one to the other.
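
As a rough sketch of that weak typing, the XPath 1.0 "=" rules for strings 
and numbers can be approximated in a few lines of Python (not Pharo). 
Booleans and node-sets are left out, and the function names are invented for 
illustration; they are not part of the XPath lib.

```python
import math
import re

# XPath 1.0 number syntax: optional minus, digits with optional fraction,
# surrounded by optional whitespace.
_NUMBER = re.compile(r"\s*-?(\d+(\.\d*)?|\.\d+)\s*")


def xpath_number(value):
    """Approximates XPath number(): non-numeric strings become NaN."""
    if isinstance(value, (int, float)):
        return float(value)
    return float(value) if _NUMBER.fullmatch(value) else math.nan


def xpath_equals(left, right):
    """Approximates XPath 1.0 '=' for strings and numbers only."""
    if isinstance(left, (int, float)) or isinstance(right, (int, float)):
        # If either side is a number, both sides are compared as numbers.
        return xpath_number(left) == xpath_number(right)
    # Two strings are compared as strings.
    return left == right


string_vs_number = xpath_equals("1", 1)
number_vs_string = xpath_equals(1, "1.0")
string_vs_string = xpath_equals("1", "1.0")
```

Under these rules a string equals a number whenever it converts to the same 
value, while the two strings "1" and "1.0" still differ from each other.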

Re: [Pharo-users] String vs Symbol use cases

2015-09-17 Thread monty


> Remember the xPath: usage gives access to full XPath syntax (not just axis 
> and nametests), including predicates, functions, and variables. XPath is 
> really a different language so mapping it all to a ST DSL is tricky. For 
> example, XPath 1.0 is weakly typed so "1" = 1 = "1.0" but clearly this is 
> false in ST. Be aware of the differences when you go from one to the other.

By this I mean XPath has many JavaScript-style implicit conversions. For 
example, if an XPath predicate evaluates to a node set, it's converted to a 
boolean: true if non-empty, false otherwise. Collections in ST obviously aren't 
converted automatically to true or false when used with ifTrue:ifFalse:.
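
A minimal sketch of that conversion, written in Python rather than ST and 
following the XPath 1.0 boolean() rules. Modeling a node set as a plain list 
is an assumption made for the example, and xpath_boolean is a made-up name.

```python
import math


def xpath_boolean(value):
    """Approximates XPath 1.0 boolean() for the four XPath value types."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        # Numbers are true unless they are zero or NaN.
        return value != 0 and not math.isnan(value)
    if isinstance(value, str):
        # Strings are true unless empty, so even "false" is true.
        return len(value) > 0
    # Assumption: anything else is a node set, modeled here as a list.
    return len(value) > 0


empty_node_set = xpath_boolean([])
non_empty_node_set = xpath_boolean(["<entity/>"])
string_false = xpath_boolean("false")
```

An explicit step like this is what a DSL on the ST side has to perform 
wherever XPath would convert silently.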

Re: [Pharo-users] String vs Symbol use cases

2015-09-17 Thread monty

Hernan maintained the old lib. The PharoExtras/XPath lib you get from the 
catalog or config browsers was rewritten and questions/bugs should be sent to 
me. Note it is incompatible with Pastell, so you can't have both in the same 
image.

> Sent: Thursday, September 17, 2015 at 4:07 PM
> From: "Alexandre Bergel" 
> To: "Any question about pharo is welcome" 
> Subject: Re: [Pharo-users] String vs Symbol use cases
>
> Ah okay!
> 
> Alexandre
> 
> 
> > On Sep 17, 2015, at 2:55 PM, Peter Uhnák  wrote:
> > 
> > You are here defining a small api to formulate xpath queries, which is 
> > great.
> > 
> > Nono, this is XPath library by Hernan (available from Catalog Browser), I'm 
> > just using it.
> 
> -- 
> _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
> Alexandre Bergel  http://www.bergel.eu
> ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.
> 
> 
> 
> 
>

Re: [Pharo-users] XPath (was String vs Symbol use cases)

2015-09-17 Thread monty

I understand the problem and I'm studying possible fixes while reviewing the 
specs, but I need more time. If you could work around it for now, that would be 
good. I will update you by mail.
