Re: Converting from HTML

2008-10-19 Thread Drarok Ithaqua
Following that link and pasting the code, I find it doesn't work
correctly. It's replacing &amp; with a space!

All I'm trying to do is parse the RSS feed URL out of the head
section of a document. I didn't think it'd be so difficult!

As this app is going to be used on a specific site, I think I'll just
do manual string replacements for now, as it seems the easiest solution.
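
For the manual-replacement route, a minimal sketch might look like the following. The function name DIStringByDecodingBasicEntities is made up for illustration, and it only handles a handful of named entities; numeric references such as &#8217; are left untouched, so treat it as a stopgap rather than a general decoder.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): decodes a few common
// entities by plain string replacement.
static NSString *DIStringByDecodingBasicEntities(NSString *input)
{
    NSMutableString *result = [NSMutableString stringWithString:input];

    // "&amp;" is deliberately last, so that "&amp;lt;" decodes to the
    // literal text "&lt;" instead of being decoded twice into "<".
    NSArray *entities = [NSArray arrayWithObjects:
        @"&lt;", @"&gt;", @"&quot;", @"&apos;", @"&#39;", @"&amp;", nil];
    NSArray *replacements = [NSArray arrayWithObjects:
        @"<", @">", @"\"", @"'", @"'", @"&", nil];

    NSUInteger i;
    for (i = 0; i < [entities count]; i++) {
        // The range is recomputed each pass because the length changes.
        [result replaceOccurrencesOfString:[entities objectAtIndex:i]
                                withString:[replacements objectAtIndex:i]
                                   options:NSLiteralSearch
                                     range:NSMakeRange(0, [result length])];
    }
    return result;
}
```

Passing it @"/search/unique&amp;stuff&amp;here" should give back the same path with plain ampersands.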


___

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)

Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com

Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com

This email sent to [EMAIL PROTECTED]


Re: Converting from HTML

2008-10-17 Thread Mike Abdullah

On 17 Oct 2008, at 03:17, Drarok Ithaqua wrote:

> Hi all, I'm trying to find a way to convert an HTML-originated URL
> into one I can use in Cocoa.
>
> Example input: <link type="application/rss+xml" rel="alternate"
> href="/search/unique&amp;stuff&amp;here" />
>
> I know the URL that this data is fetched from, so I can prefix that
> to achieve a full URL again, but I need to convert the &amp;
> into plain ampersands, and there could be all kinds of HTML
> characters in there. Is there a category on NSString out there I could
> use for this?

Please, please, please don't try and append the unescaped string to
the base URL. Cocoa has a full NSURL API for this sort of thing, so
instead do:

NSURL *URL = [NSURL URLWithString:unescapedString relativeToURL:baseURL];
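
To make the behaviour concrete, here is a small sketch of that call wrapped in a function. The name DIFeedURL and the example.com addresses are assumptions for illustration only; the point is that NSURL resolves the href against the page URL, so a site-relative path replaces the page's path while a bare file name is resolved next to the page.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): resolves an href found in a
// page against the URL that page was fetched from.
static NSURL *DIFeedURL(NSString *href, NSURL *pageURL)
{
    // URLWithString:relativeToURL: can return nil when href is not a
    // valid URL string, so callers should check the result.
    NSURL *relative = [NSURL URLWithString:href relativeToURL:pageURL];

    // absoluteURL folds the base and the relative part into one URL.
    return [relative absoluteURL];
}
```

For example, with a page at http://www.example.com/blog/index.html, the href /search/unique&stuff&here should resolve to http://www.example.com/search/unique&stuff&here, and feed.xml to http://www.example.com/blog/feed.xml.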




Re: Converting from HTML

2008-10-17 Thread Aurora Phoenix
Hi DI... Depending on how deeply you need to understand the structure of
the HTML, a simple parse using string chopping/ranges might be sufficient,
or (as I would personally prefer) you could use something like libxml2 /
XPath to grok the input. Note that simple resolution of entities might not
be sufficient, particularly because you seem to be grabbing URIs/URLs. If
you are grabbing URLs with the intent of passing them on to the URL Loading
System (NSURL*, NSHTTPURL*), you will also need to ensure, before passing
them on, that the string itself is URL-encoded (e.g., replacing spaces with
%20 and such).
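
As a rough illustration of the XPath route, here is a sketch that uses Foundation's NSXMLDocument rather than libxml2 directly (a swapped-in choice, only to keep the example short). The function name DIFeedHrefInHTML is hypothetical, and the query assumes the feed is advertised with type="application/rss+xml" as in the example input from the original post.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread). Assumes ARC or garbage
// collection; manual memory management is omitted for brevity.
static NSString *DIFeedHrefInHTML(NSData *htmlData)
{
    NSError *error = nil;

    // NSXMLDocumentTidyHTML asks NSXMLDocument to clean the HTML up into
    // well-formed XML before building the tree.
    NSXMLDocument *doc = [[NSXMLDocument alloc] initWithData:htmlData
                                                     options:NSXMLDocumentTidyHTML
                                                       error:&error];
    if (doc == nil) {
        return nil;
    }

    // local-name() sidesteps the question of whether the tidied document
    // places its elements in the XHTML namespace.
    NSString *query =
        @"//*[local-name()='link'][@type='application/rss+xml']/@href";
    NSArray *nodes = [doc nodesForXPath:query error:&error];
    if ([nodes count] == 0) {
        return nil;
    }

    // The attribute's stringValue has its entities already resolved by
    // the parser, so no separate &amp; handling is needed here.
    return [[nodes objectAtIndex:0] stringValue];
}
```

The returned string is still relative to the page, so it would then be resolved against the page's URL before being handed to the URL Loading System.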

Someone else has posted the link to the ThinkMac blog, which has a snippet
for cheap character entity resolution in Objective-C
(http://www.thinkmac.co.uk/blog/2005/05/removing-entities-from-html-in-cocoa.html).


Cheers and good luck!






Converting from HTML

2008-10-16 Thread Drarok Ithaqua
Hi all, I'm trying to find a way to convert an HTML-originated URL
into one I can use in Cocoa.

Example input: <link type="application/rss+xml" rel="alternate"
href="/search/unique&amp;stuff&amp;here" />

I know the URL that this data is fetched from, so I can prefix that to
achieve a full URL again, but I need to convert the &amp; into plain
ampersands, and there could be all kinds of HTML characters in there.
Is there a category on NSString out there I could use for this?

I read somewhere that I could use an NSAttributedString and
initWithHTML, but that leaves me with an empty string. I'm guessing
that's because it's inside a head tag? Not sure.

I'm also open to using something more intelligent than my current
method of searching the string for "<link " to find the RSS feed, if
there's perhaps an easier way that would also convert the HTML
characters for me. Maybe WebKit has something for me?

I look forward to your replies, and you have my thanks in advance.

 - Drarok


Re: Converting from HTML

2008-10-16 Thread Nathan Day

Try this

http://www.thinkmac.co.uk/blog/2005/05/removing-entities-from-html-in-cocoa.html 





Nathan Day
[EMAIL PROTECTED]
http://homepage.mac.com/nathan_day/
