On Fri, Apr 25, 2008 at 7:33 AM, Gangadhara Prasad <[EMAIL PROTECTED]> wrote:
>
> Hi Experts,
>
> How can I pull data from other websites and get it stored in my database.
>
> I need to pull all categories, subcategories and urls in
> http://www.winedirectory.org/
>
> Thanks,
> Gangadhar
>
Hi Gangadhar,

I've been doing similar work, extracting events from an events guide.

Firstly, make sure you have the permission of the website whose information you are gathering. This is more about respect than anything else, but some owners take plagiarism seriously. You might also find that the owner is happy to provide you with a feed of the information, or a one-off export from his database.

My method has been to examine the page (load it in a browser and view the source) and identify patterns that I can use to find the information. Look for common blocks of HTML, elements, classes and ids, things like that.

Once you've done that, you'll need to write a PHP script to fetch the page and identify those blocks for itself. I've found it useful to use strpos and substr to break the page down into a manageable chunk before resorting to regular expressions to pull the actual detail out of it. Use functions to good effect, especially if you need to read multiple pages with similar information on them.
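To give you an idea, something along these lines might get you started. It's only a rough sketch: the marker strings and the link pattern are made up for the example, not taken from winedirectory.org, so view the real page source and adjust them to match the actual HTML.

<?php
// Rough sketch only -- the marker strings and the link pattern below are
// guesses for the sake of the example, not taken from winedirectory.org.

function fetch_page($url)
{
    // file_get_contents is the simplest way to grab a page; switch to cURL
    // if you need timeouts, custom headers or better error handling.
    return file_get_contents($url);
}

function extract_links($html, $start_marker, $end_marker)
{
    // Narrow the page down to the block we care about with strpos/substr
    // before reaching for regular expressions.
    $start = strpos($html, $start_marker);
    if ($start === false) {
        return array();
    }
    $end = strpos($html, $end_marker, $start);
    if ($end === false) {
        $end = strlen($html);
    }
    $chunk = substr($html, $start, $end - $start);

    // Pull the URL and link text out of every anchor in that chunk.
    preg_match_all('/<a\s+href="([^"]+)"[^>]*>(.*?)<\/a>/is', $chunk, $matches, PREG_SET_ORDER);

    $links = array();
    foreach ($matches as $match) {
        $links[] = array('url' => $match[1], 'text' => strip_tags($match[2]));
    }
    return $links;
}

// Hypothetical usage -- adjust the markers to whatever the real HTML uses.
$html       = fetch_page('http://www.winedirectory.org/');
$categories = extract_links($html, '<div class="categories">', '</div>');

foreach ($categories as $category) {
    // This is where you'd INSERT into your database instead of echoing.
    echo $category['text'] . ' => ' . $category['url'] . "\n";
}
?>

From there it's a matter of looping over the links, inserting them into your database (mysqli or PDO with bound parameters), and repeating the same process on each subcategory page.

Regards,
Phill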