Re: NSXML Parsing Problem
> On Thu, 2010/03/25, Keary Suska wrote:
>> Maybe a cool option for NSXML would be to be able to
>> specify the & pound ; sequence and have it map it to
>> whatever...
> My XML is a little rusty but IIRC this is an XML issue, and
> any XML parser would choke. You have to define (or perhaps
> more properly "declare") every named entity other than the
> pre-defined named entities such as &gt;, &lt; and
> &amp;.
>
> I believe you can use numeric references with impunity:
> &#nnn; but make sure it jibes with your character
> encoding.

Agreed. pound is defined in HTML 4 but not in XML itself, which predefines entities only for amp, lt, gt, apos, and quot:
http://www.w3.org/TR/2006/REC-xml11-20060816/#intern-replacement

But, as is the current fashion, the spec makes it difficult to put the pieces together, though its authors may believe they are doing so clearly here
http://www.w3.org/TR/2006/REC-xml11-20060816/#intern-replacement
here
http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-entexpand
and here
http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EntityValue

But if you've told the parser you're using UTF-8 or UTF-16, the pound sign shouldn't need an ampersand escape at all, since the British pound sterling symbol has no special meaning in XML markup; in that case you just use the Unicode character directly.

But if you want to be compatible with HTML 4, you've got to declare that entity yourself.

___
Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com
This email sent to arch...@mail-archive.com
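To make the distinction in the message above concrete, here is a minimal sketch. It uses Python's standard-library parser rather than NSXMLParser, on the assumption that the behavior comes from XML itself and not from any one parser; the `<price>` element and the `parse_text` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

def parse_text(xml_bytes):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(xml_bytes).text
    except ET.ParseError:
        return None

# A named entity that XML does not predefine: rejected as undefined.
named = parse_text(b'<?xml version="1.0" encoding="UTF-8"?><price>&pound;5</price>')

# The literal character, encoded as UTF-8 to match the declaration: accepted.
literal = parse_text(
    '<?xml version="1.0" encoding="UTF-8"?><price>\u00a35</price>'.encode("utf-8")
)

# The five predefined entities need no declaration at all.
predefined = parse_text(b"<t>&amp;&lt;&gt;&apos;&quot;</t>")
```

Here `named` ends up as `None` because the parse fails, while `literal` holds the pound sign followed by `5`. Whether NSXMLParser reports the failure the same way is not something this sketch shows.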
Re: NSXML Parsing Problem
On Mar 25, 2010, at 7:34 PM, Dave wrote:

> I was wondering if changing the XML charset would solve the problem? From
> searching the Web I think the problem could be that we are assuming UTF-8, I
> was wondering if we changed it to one of the ISO char sets if this would
> solve it.

No, it has nothing to do with character set. It sounds like the file is ASCII.

> Maybe a cool option for NSXML would be to be able to specify the & pound ;
> sequence and have it map it to whatever...

Yes, that’s called a DTD in XML lingo. I don’t know if NSXMLParser supports those. Your documents would need to use some sort of DTD to be considered valid XML, due to the undefined entities.

--Jens
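As a rough illustration of the DTD route mentioned above, a document can declare the entity in its own internal DTD subset. The sketch below uses Python's standard-library parser on a made-up `<item>` document; it does not answer the open question of whether NSXMLParser honors such declarations.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset declares "pound" as U+00A3,
# so the file itself can stay pure ASCII.
DOC = b"""<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE item [
  <!ENTITY pound "&#163;">
]>
<item>&pound;5</item>"""

# The parser expands the declared entity while building the tree.
price = ET.fromstring(DOC).text
```

With the declaration in place, `price` contains the pound sign followed by `5`; removing the `<!ENTITY ...>` line brings back the undefined-entity error.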
Re: NSXML Parsing Problem
On Mar 25, 2010, at 8:34 PM, Dave wrote:

> Hi Jens,
>
> Thanks for taking the time to reply. We are a startup and basically just
> trying to get things going with what we have. I'm downloading the XML data via
> a URL and I could just change the database and strip out the offending
> characters. I was wondering if changing the XML charset would solve the
> problem? From searching the Web I think the problem could be that we are
> assuming UTF-8, I was wondering if we changed it to one of the ISO char sets
> if this would solve it.
>
> Maybe a cool option for NSXML would be to be able to specify the & pound ;
> sequence and have it map it to whatever...

My XML is a little rusty but IIRC this is an XML issue, and any XML parser would choke. You have to define (or perhaps more properly "declare") every named entity other than the pre-defined named entities such as &gt;, &lt; and &amp;.

I believe you can use numeric references with impunity: &#nnn; but make sure it jibes with your character encoding.

HTH,

Keary Suska
Esoteritech, Inc.
"Demystifying technology for your home or business"
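A short sketch of the numeric-reference option, again using Python's standard-library parser and a made-up `<p>` element. As far as the XML specification goes, a numeric reference names a Unicode code point regardless of the document's declared encoding, which is why the example deliberately declares the file as ASCII.

```python
import xml.etree.ElementTree as ET

# Decimal (&#163;) and hexadecimal (&#xA3;) references to U+00A3,
# inside a document that is declared, and stored, as plain ASCII.
DOC = b'<?xml version="1.0" encoding="US-ASCII"?><p>&#163;5 and &#xA3;5</p>'

# Both references resolve to the same character in the parsed text.
text = ET.fromstring(DOC).text
```

Both forms come out as the pound sign in `text`, with no entity declaration needed.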
Re: NSXML Parsing Problem
Typically what I do is download the XML into a string. Then, if there are special characters that I know about in advance, I can use string class methods to replace them in the string before passing it off to the XML parser.

Just another option to consider.

jack

On Mar 25, 2010, at 10:34 PM, Dave wrote:

> Hi Jens,
>
> Thanks for taking the time to reply. We are a startup and basically just trying to get things going with what we have. I'm downloading the XML data via a URL and I could just change the database and strip out the offending characters. I was wondering if changing the XML charset would solve the problem? From searching the Web I think the problem could be that we are assuming UTF-8, I was wondering if we changed it to one of the ISO char sets if this would solve it.
>
> Maybe a cool option for NSXML would be to be able to specify the & pound ; sequence and have it map it to whatever...
>
> Thanks again
> Dave
>
> On 25 Mar 2010, at 23:13, Jens Alfke wrote:
>
>> On Mar 25, 2010, at 8:47 AM, Dave wrote:
>>
>>> I am getting an error using NSXMLParser if it encounters a British Pound Sign - it's encoded as & pound ; (minus the spaces). Any idea on how to solve this??
>>
>> Basic XML only defines a handful of character entities. The other common ones are part of HTML. Are you sure this document is valid XML?
>>
>> I'm more familiar with NSXMLDocument than NSXMLParser, so I'm not sure how you tell the latter to handle arbitrary character entities. Sorry I can't be more help.
>>
>> --Jens
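The replace-before-parsing idea above could look roughly like the following. This is a sketch in Python rather than Cocoa; `replace_html_entities` and the sample markup are hypothetical, and the entity table is Python's own `html.entities.name2codepoint`. In Cocoa the analogous step would presumably use NSString's `-stringByReplacingOccurrencesOfString:withString:` for each entity known in advance.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Entities that XML already understands and that must be left alone.
XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

def replace_html_entities(source: str) -> str:
    """Rewrite HTML named entities as numeric references before parsing."""
    def to_numeric(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave predefined or unknown names untouched
        return "&#%d;" % name2codepoint[name]
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", to_numeric, source)

# Hypothetical downloaded payload containing an HTML-only entity.
raw = "<item><name>Tea &amp; biscuits</name><price>&pound;5</price></item>"
cleaned = replace_html_entities(raw)
root = ET.fromstring(cleaned)
```

After the rewrite, `cleaned` contains `&#163;` in place of the named entity and parses without error. One caveat with this naive approach: the regular expression also rewrites matches inside CDATA sections and comments, so it only suits data where that cannot happen.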
Re: NSXML Parsing Problem
Hi Jens,

Thanks for taking the time to reply. We are a startup and basically just trying to get things going with what we have. I'm downloading the XML data via a URL and I could just change the database and strip out the offending characters. I was wondering if changing the XML charset would solve the problem? From searching the Web I think the problem could be that we are assuming UTF-8, I was wondering if we changed it to one of the ISO char sets if this would solve it.

Maybe a cool option for NSXML would be to be able to specify the & pound ; sequence and have it map it to whatever...

Thanks again
Dave

On 25 Mar 2010, at 23:13, Jens Alfke wrote:

> On Mar 25, 2010, at 8:47 AM, Dave wrote:
>
>> I am getting an error using NSXMLParser if it encounters a British Pound Sign - it's encoded as & pound ; (minus the spaces). Any idea on how to solve this??
>
> Basic XML only defines a handful of character entities. The other common ones are part of HTML. Are you sure this document is valid XML?
>
> I'm more familiar with NSXMLDocument than NSXMLParser, so I'm not sure how you tell the latter to handle arbitrary character entities. Sorry I can't be more help.
>
> --Jens
Re: NSXML Parsing Problem
On Mar 25, 2010, at 8:47 AM, Dave wrote:

> I am getting an error using NSXMLParser if it encounters a British Pound Sign - it's encoded as & pound ; (minus the spaces). Any idea on how to solve this??

Basic XML only defines a handful of character entities. The other common ones are part of HTML. Are you sure this document is valid XML?

I'm more familiar with NSXMLDocument than NSXMLParser, so I'm not sure how you tell the latter to handle arbitrary character entities. Sorry I can't be more help.

--Jens