I second the recommendation to use QueryPath. I use it almost exclusively, together with drupal_http_request(), and I use curl in only a few places (if you do use curl, I recommend http://drupal.org/project/curl as a dependency check). Whichever you pick, I'd really recommend writing a custom module that uses the tools above and holds your own filtering logic. I've done this for about a dozen modules now.
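To make the custom-module idea concrete, here is a minimal sketch of the two pieces of filtering logic Jim asked about: the page title and the provider. It uses PHP's built-in DOM extension rather than QueryPath so that it runs without extra libraries. The function names example_extract_title() and example_extract_provider() are hypothetical, and treating the link's host name as the "provider" is my assumption, not something the thread settles.

```php
<?php
/**
 * Hypothetical helper: returns the text of the first <title> element.
 *
 * @param string $html Raw HTML, e.g. the body of an HTTP response.
 * @return string|null The title with whitespace collapsed, or NULL if none.
 */
function example_extract_title($html) {
  if (trim($html) === '') {
    return NULL;
  }
  $doc = new DOMDocument();
  // Real-world HTML is rarely valid, so collect parser warnings silently.
  $previous = libxml_use_internal_errors(TRUE);
  $loaded = $doc->loadHTML($html);
  libxml_clear_errors();
  libxml_use_internal_errors($previous);
  if (!$loaded) {
    return NULL;
  }
  $titles = $doc->getElementsByTagName('title');
  if ($titles->length == 0) {
    return NULL;
  }
  // Collapse line breaks and runs of spaces inside the title.
  return trim(preg_replace('/\s+/', ' ', $titles->item(0)->textContent));
}

/**
 * Hypothetical helper: guesses the provider from the link's host name.
 *
 * Assumption: the host (minus a leading "www.") identifies the provider.
 */
function example_extract_provider($url) {
  $host = parse_url($url, PHP_URL_HOST);
  if (!is_string($host) || $host === '') {
    return NULL;
  }
  return preg_replace('/^www\./i', '', strtolower($host));
}

// Inside a Drupal 6 module the HTML would come from drupal_http_request():
//   $response = drupal_http_request($url);
//   if ($response->code == 200) {
//     $title = example_extract_title($response->data);
//   }
```

With QueryPath installed, the title lookup should shrink to a single selector query on 'title'; the DOM version above is only there to keep the sketch dependency-free.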
That said, there are more modules available nowadays, such as http://drupal.org/project/feeds_xpathparser used together with Feeds (http://drupal.org/project/feeds). There are about a dozen more modules that will accomplish the goal. I haven't used those myself, though I did go through and try most of the methods for some recent projects.

Cheers,
Kevin O'Brien
Drupal Developer
http://www.coderintherye.com
415-754-0112

On Tue, Nov 30, 2010 at 11:26 AM, <[email protected]> wrote:
> Today's Topics:
>
>   1. Drupal module for scraping information from an HTML/XML
>      document (James Benstead)
>   2. Re: Drupal module for scraping information from an HTML/XML
>      document (John Fiala)
>   3. Easter problem (Ámon Tamás)
>   4. Re: Easter problem (Carl Wiedemann)
>   5. Re: Easter problem ([email protected])
>   6. Re: Easter problem ([email protected])
>   7. Re: Easter problem ([email protected])
>   8. Re: Easter problem (Jennifer Hodgdon)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 30 Nov 2010 18:56:09 +0000
> From: James Benstead <[email protected]>
> Subject: [development] Drupal module for scraping information from an
>          HTML/XML document
> To: development <[email protected]>
>
> I've finally got round to doing some serious work on Drupalversity, an
> open, web-based Drupal education project I've had in mind for a year or
> so.
>
> People who use Drupalversity to learn have the option of adding
> Resources to the site - i.e., links to posts at Lullabot, Chapter3 etc
> that explain how to do specific things with Drupal. A Resource is a
> custom content type that includes a link to the resource and a text
> field containing a description of that resource.
>
> What I'd like to do once a Resource has been added to the site is to
> scrape certain information from it: at this point I'm thinking the Title
> of the page the link points to and the provider of the resource - e.g.,
> which Drupal shop originally created the resource. What's the best way
> to go about doing this? I'm pretty sure there's not a Drupal module that
> solves the problem out of the box.
>
> So far I've considered:
>
>   - http://drupal.org/project/querypath
>   - Drupal's built-in drupal_http_request() -
>     http://api.drupal.org/api/drupal/includes--common.inc/function/drupal_http_request/6
>   - curl
>
> Thanks,
>
> --Jim
> --
> My IM and Skype details are at http://state68.com/contact
>
> ------------------------------
>
> Message: 2
> Date: Tue, 30 Nov 2010 12:06:33 -0700
> From: John Fiala <[email protected]>
> Subject: Re: [development] Drupal module for scraping information from
>          an HTML/XML document
> To: [email protected]
>
> These days, if I'm going to be trying to extract data from html/xml,
> I'd use querypath. Give it a try!
>
> On Tue, Nov 30, 2010 at 11:56 AM, James Benstead <[email protected]> wrote:
> > What I'd like to do once a Resource has been added to the site is to
> > scrape certain information from it: at this point I'm thinking the
> > Title of the page the link points to and the provider of the resource
> > - e.g., which Drupal shop originally created the resource. What's the
> > best way to go about doing this? I'm pretty sure there's not a Drupal
> > module that solves the problem out of the box.
>
> --
> John Fiala
> www.jcfiala.net
>
> ------------------------------
>
> Message: 3
> Date: Tue, 30 Nov 2010 20:14:04 +0100
> From: Ámon Tamás <[email protected]>
> Subject: [development] Easter problem
> To: [email protected]
>
> Hello,
>
> I have the nameday module (http://drupal.org/project/nameday) and I get
> a feature request for the Greek namedays. How I see it is based on the
> Easter, what is not an easy thing to count.
>
> Well, I want to find some algorithm for Easter, and similar days, what
> is can be stored somehow. Maybe it should be a hook or some other think
> what can be stored in database.
>
> Thanks
>
> --
> Ámon Tamás
> Site developer and programmer
>
> ------------------------------
>
> Message: 4
> Date: Tue, 30 Nov 2010 12:22:42 -0700
> From: Carl Wiedemann <[email protected]>
> Subject: Re: [development] Easter problem
> To: [email protected]
>
> Does this help?
> http://php.net/manual/en/function.easter-days.php
>
> On Tue, Nov 30, 2010 at 12:14 PM, Ámon Tamás <[email protected]> wrote:
>
> > Hello,
> >
> > I have the nameday module (http://drupal.org/project/nameday) and I
> > get a feature request for the Greek namedays. How I see it is based on
> > the Easter, what is not an easy thing to count.
> >
> > Well, I want to find some algorithm for Easter, and similar days, what
> > is can be stored somehow. Maybe it should be a hook or some other
> > think what can be stored in database.
> >
> > Thanks
> >
> > --
> > Ámon Tamás
> > Site developer and programmer
>
> ------------------------------
>
> Message: 5
> Date: Tue, 30 Nov 2010 13:24:07 -0600
> From: "[email protected]" <[email protected]>
> Subject: Re: [development] Easter problem
> To: [email protected]
>
> There's no need for a hook here at all. You can either code in the
> algorithm for defining when Easter is (which sounds like it is in fact
> rather complicated) or just pre-store known pre-calculated dates for it
> for the next decade or so. (10 records, one per year; totally easy.)
>
> Both options are described here, including the different mechanisms for
> defining when Easter is in different calendars:
>
> http://en.wikipedia.org/wiki/Easter#Date_of_Easter
>
> --Larry Garfield
>
> On 11/30/10 1:14 PM, Ámon Tamás wrote:
> > Hello,
> >
> > I have the nameday module (http://drupal.org/project/nameday) and I
> > get a feature request for the Greek namedays. How I see it is based on
> > the Easter, what is not an easy thing to count.
> >
> > Well, I want to find some algorithm for Easter, and similar days, what
> > is can be stored somehow.
> > Maybe it should be a hook or some other think what can be stored in
> > database.
> >
> > Thanks
> >
> > --
> > Ámon Tamás
> > Site developer and programmer
>
> ------------------------------
>
> Message: 6
> Date: Tue, 30 Nov 2010 14:23:56 -0500
> From: [email protected]
> Subject: Re: [development] Easter problem
> To: [email protected]
>
> You can google it, but I believe this is one of those things that cannot
> be reduced to an equation or algorithm. It's something like the first
> Sunday after the first full moon after the spring equinox.
>
> On 11/30/2010 02:14 PM, Ámon Tamás wrote:
> > Hello,
> >
> > I have the nameday module (http://drupal.org/project/nameday) and I
> > get a feature request for the Greek namedays. How I see it is based on
> > the Easter, what is not an easy thing to count.
> >
> > Well, I want to find some algorithm for Easter, and similar days, what
> > is can be stored somehow. Maybe it should be a hook or some other
> > think what can be stored in database.
> >
> > Thanks
> >
> > --
> > Ámon Tamás
> > Site developer and programmer
>
> ------------------------------
>
> Message: 7
> Date: Tue, 30 Nov 2010 13:26:23 -0600
> From: "[email protected]" <[email protected]>
> Subject: Re: [development] Easter problem
> To: [email protected]
>
> The Calendar PHP module is not enabled by default in a stock PHP, so I
> don't know that you can rely on it (unfortunately). It does have some
> cool stuff in it, though.
>
> --Larry Garfield
>
> On 11/30/10 1:22 PM, Carl Wiedemann wrote:
> > Does this help?
> > http://php.net/manual/en/function.easter-days.php
> >
> > On Tue, Nov 30, 2010 at 12:14 PM, Ámon Tamás <[email protected]> wrote:
> >
> >     Hello,
> >
> >     I have the nameday module (http://drupal.org/project/nameday) and
> >     I get a feature request for the Greek namedays. How I see it is
> >     based on the Easter, what is not an easy thing to count.
> >
> >     Well, I want to find some algorithm for Easter, and similar days,
> >     what is can be stored somehow. Maybe it should be a hook or some
> >     other think what can be stored in database.
> >
> >     Thanks
> >
> >     --
> >     Ámon Tamás
> >     Site developer and programmer
>
> ------------------------------
>
> Message: 8
> Date: Tue, 30 Nov 2010 11:21:08 -0800
> From: Jennifer Hodgdon <[email protected]>
> Subject: Re: [development] Easter problem
> To: [email protected]
>
> http://php.net/manual/en/function.easter-date.php
>
> On 11/30/2010 11:14 AM, Ámon Tamás wrote:
> > I have the nameday module (http://drupal.org/project/nameday) and I
> > get a feature request for the Greek namedays. How I see it is based on
> > the Easter, what is not an easy thing to count.
> >
> > Well, I want to find some algorithm for Easter, and similar days, what
> > is can be stored somehow. Maybe it should be a hook or some other
> > think what can be stored in database.
>
> --
> Jennifer Hodgdon * Poplar ProductivityWare
> www.poplarware.com
> Drupal web sites and custom Drupal modules
>
> ------------------------------
>
> --
> [ Drupal development list | http://lists.drupal.org/ ]
>
> End of development Digest, Vol 95, Issue 58
> *******************************************
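
On the Easter thread quoted above: the date can be computed with plain integer arithmetic, so it should not need a hook or the Calendar extension. Below is a minimal sketch, assuming the Greek namedays follow Orthodox Easter (my assumption; the thread does not say). It uses Meeus's Julian algorithm and then adds the 13-day gap between the Julian and Gregorian calendars, which only holds for the years 1900 to 2099. The function name example_orthodox_easter() is hypothetical.

```php
<?php
/**
 * Hypothetical helper: Gregorian calendar date of Orthodox Easter.
 *
 * Computes the Julian-calendar date with Meeus's Julian algorithm, then
 * shifts it by 13 days. That fixed shift is only valid for 1900-2099.
 *
 * @param int $year Four-digit year between 1900 and 2099.
 * @return string Date formatted as YYYY-MM-DD.
 */
function example_orthodox_easter($year) {
  if ($year < 1900 || $year > 2099) {
    throw new InvalidArgumentException('Only valid for the years 1900-2099.');
  }
  $a = $year % 4;
  $b = $year % 7;
  $c = $year % 19;
  $d = (19 * $c + 15) % 30;
  $e = (2 * $a + 4 * $b - $d + 34) % 7;
  // Month is 3 (March) or 4 (April) in the Julian calendar.
  $month = (int) floor(($d + $e + 114) / 31);
  $day = (($d + $e + 114) % 31) + 1;
  // Reuse the Julian day and month as a Gregorian date, then add the offset.
  $date = new DateTime(
    sprintf('%04d-%02d-%02d', $year, $month, $day),
    new DateTimeZone('UTC')
  );
  $date->modify('+13 days');
  return $date->format('Y-m-d');
}

// Pre-storing a decade of dates, as Larry suggested, is then a short loop:
//   for ($y = 2011; $y <= 2020; $y++) { $dates[$y] = example_orthodox_easter($y); }
```

Namedays that move with Easter could then be stored as day offsets from the returned date.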
