Web bots can ignore the robots.txt file, and most scrapers will.
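
Since robots.txt is only advisory, serving the PDFs through a script from a folder outside the web root, as Dale describes, is the more dependable route. Below is a minimal sketch of such a script, not a finished implementation. The folder path, the function name resolve_pdf_path() and the "file" query parameter are all my own placeholders, and it assumes PHP 5.4 or later.

```php
<?php
// Hypothetical folder holding the PDFs, outside public_html (an assumption).
const PDF_DIR = '/home/example/private_pdfs';

/**
 * Map a requested file name to a real path inside $baseDir.
 * Returns null if the name is unsafe or the file does not exist.
 */
function resolve_pdf_path($baseDir, $requested)
{
    // basename() drops any directory components such as "../".
    $name = basename($requested);
    if ($name === '' || strtolower(pathinfo($name, PATHINFO_EXTENSION)) !== 'pdf') {
        return null;
    }

    // realpath() returns false when the path does not exist.
    $base = realpath($baseDir);
    $path = realpath($baseDir . DIRECTORY_SEPARATOR . $name);
    if ($base === false || $path === false || !is_file($path)) {
        return null;
    }

    // Confirm the resolved path is still inside the base folder (guards against symlinks).
    if (strpos($path, $base . DIRECTORY_SEPARATOR) !== 0) {
        return null;
    }
    return $path;
}

// Only stream a file when run by the web server, not from the command line.
if (PHP_SAPI !== 'cli') {
    $requested = isset($_GET['file']) ? $_GET['file'] : '';
    $path = resolve_pdf_path(PDF_DIR, $requested);
    if ($path === null) {
        http_response_code(404);
        exit('File not found.');
    }
    header('Content-Type: application/pdf');
    header('Content-Disposition: inline');
    header('Content-Length: ' . filesize($path));
    readfile($path);
    exit;
}
```

If this were saved as, say, getpdf.php (again a made-up name), a link would look like getpdf.php?file=somefile.pdf. Note that moving the files does not by itself keep a bot from requesting the script's URL. The script is simply the place where a check (a login, a session token, a referrer test) could be added before readfile() runs.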
On Mar 13, 2013 4:59 PM, "Jen Rasmussen" <j...@cetaceasound.com> wrote:

> -----Original Message-----
> From: Dale H. Cook [mailto:radiot...@plymouthcolony.net]
> Sent: Wednesday, March 13, 2013 3:38 PM
> To: php-general@lists.php.net
> Subject: [PHP] Accessing Files Outside the Web Root
>
> Let me preface my question by noting that I am virtually a PHP novice.
> Although I am a long-time webmaster, and have used PHP for some years to
> give visitors access to information in my SQL database, this is my first
> attempt to use it for another purpose. I have browsed the mailing list
> archives and have searched online but have not yet succeeded in teaching
> myself how to do what I want to do. This need not provoke a lengthy
> discussion or involve extensive hand-holding - if someone can point to an
> appropriate code sample or online tutorial that might do the trick.
>
> I am the author of a number of PDF files that serve as genealogical
> reference works. My problem is that there are a number of sites which are
> posing as search engines and which display my PDF files in their entirety
> on their own sites. These pirate sites are not simply opening a window that
> displays my files as they appear on my site. They are using Google Docs to
> display copies of my files that are cached or stored elsewhere online. The
> proof of that is that I can modify one of my files and upload it to my
> site. The file, as seen on my site, immediately displays the modification. The
> same file, as displayed on the pirate sites, is unmodified and may remain
> unmodified for weeks.
>
> It is obvious that my files, which are stored under public_html, are being
> spidered and then stored or cached. This displeases me greatly. I want my
> files, some of which have cost an enormous amount of work over many years,
> to be available only on my site. Legitimate search engines, such as Google,
> may display a snippet, but they do not display the entire file - they link
> to my site so the visitor can get the file from me.
>
> A little study has indicated to me that if I store those files in a folder
> outside the web root and use PHP to provide access they will not be
> spidered. Writing a PHP script to provide access to the files in that
> folder is what I need help with. I have experimented with a number of code samples
> but have not been able to make things work. Could any of you point to code
> samples or tutorials that might help me? Remember that, aside from the code
> I have written to handle my SQL database, I am a PHP novice.
>
> Dale H. Cook, Member, NEHGS and MA Society of Mayflower Descendants;
> Plymouth Co. MA Coordinator for the USGenWeb Project
> Administrator of http://plymouthcolony.net
>
>
> --
> PHP General Mailing List (http://www.php.net/) To unsubscribe, visit:
> http://www.php.net/unsub.php
>
>
> Have you tried keeping all of your documents in one directory and blocking
> that directory via a robots.txt file?
>
> Jen
>
>
