On Fri, Apr 25, 2008 at 7:33 AM, Gangadhara Prasad
<[EMAIL PROTECTED]> wrote:
>
> Hi Experts,
>
>  How can I pull data from other websites and get it stored in my database.
>
>  I need to pull all categories,subcategories and urls in
> http://www.winedirectory.org/
>
>  Thanks,
>  Gangadhar
>

Hi Gangadhar,

I've been doing similar work, extracting events from an events
guide.  Firstly, make sure you have the permission of the website whose
information you are gathering.  This is more a matter of respect than
anything else, but some owners take plagiarism seriously.  You might
find that the owner is happy to provide you with a feed of the
information or a one-off export from their database.

My method has been to examine the page (load it in a browser and view
source) and identify patterns that I can use to find the information.
Look for common blocks of HTML, elements, classes and ids, things like
that.  Once you've done that, you'll need to write a PHP script to
fetch the webpage and pick out those blocks for itself.  I've found it
useful to use strpos and substr to break down the page into a
manageable chunk before resorting to regular expressions to get the
actual detail from the page.  Use functions to good effect, especially
if you need to read multiple pages with similar information on them.
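
As a rough sketch of that approach (not my actual script): the helper
names extract_between() and extract_links(), the sample markup and the
'<ul id="categories">' marker are all made up for illustration, so
you'd need to swap in whatever patterns you find in the real page
source.  It assumes PHP 7.1 or later for the type declarations.  The
sample HTML is inlined so it runs without a network connection; in
real use you'd fetch the page with something like file_get_contents().

```php
<?php
// Hypothetical helper: return the text between two marker strings,
// or null if either marker is missing from the page.
function extract_between(string $html, string $start, string $end): ?string
{
    $from = strpos($html, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);            // skip past the start marker itself
    $to = strpos($html, $end, $from);   // look for the end marker after it
    if ($to === false) {
        return null;
    }
    return substr($html, $from, $to - $from);
}

// Hypothetical helper: pull url/name pairs out of a small chunk of HTML.
// The regex only handles simple <a href="...">text</a> links.
function extract_links(string $chunk): array
{
    $links = [];
    $pattern = '/<a\s+href="([^"]+)"[^>]*>([^<]*)<\/a>/i';
    if (preg_match_all($pattern, $chunk, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $links[] = ['url' => $m[1], 'name' => trim($m[2])];
        }
    }
    return $links;
}

// Made-up sample page; in real use something like:
//   $html = file_get_contents('http://www.example.com/');
$html = '<html><body><a href="/home">Home</a>'
      . '<ul id="categories">'
      . '<li><a href="/red">Red Wine</a></li>'
      . '<li><a href="/white">White Wine</a></li>'
      . '</ul><a href="/about">About</a></body></html>';

// Step 1: cut the page down to the block we care about.
$chunk = extract_between($html, '<ul id="categories">', '</ul>');

// Step 2: only then use a regular expression on the smaller chunk.
$links = $chunk === null ? [] : extract_links($chunk);

foreach ($links as $link) {
    echo $link['name'], ' => ', $link['url'], "\n";
}
```

Cutting the page down first means the regular expression never sees
the navigation links outside the block (Home and About here), and
keeping the two steps in separate functions lets you reuse them on
every page that shares the same layout.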

Regards,

Phill
