I think this is very important for dynamic site
developers to understand. I'm very interested in
learning more about this, and I think we could all
benefit from hearing from anyone with solid search
engine experience.

I run a site with about 18,000 news articles. They are
stored in a database and dynamically generated (some
template elements update weekly). Since these articles
are mostly static once published, I generate a
Last-Modified header using the article publish_date
(with zeros for the hour/min/sec). This Last-Modified
header is also used by the internal search engine
(ht://Dig) to make articles searchable by date.
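
For anyone who wants to try the same thing, here is a minimal sketch of the idea, not my production code. The function name and the 'YYYY-MM-DD' date format are assumptions made for illustration; adjust them to however your publish_date is stored.

```php
<?php
// Hypothetical helper: builds a Last-Modified value from a publish date
// given as 'YYYY-MM-DD', with the time fixed at 00:00:00 GMT.
function last_modified_from_publish_date(string $publishDate): string
{
    // Parse as midnight UTC so the result does not depend on the
    // server's configured time zone.
    $timestamp = strtotime($publishDate . ' 00:00:00 UTC');
    if ($timestamp === false) {
        throw new InvalidArgumentException("Unparseable date: $publishDate");
    }
    // HTTP dates are always expressed in GMT.
    return gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
}

// Usage in the article template, before any output is sent
// ($article is a hypothetical database row):
// header('Last-Modified: ' . last_modified_from_publish_date($article['publish_date']));
```

Because the time part is always zero, every request for the same article reports the same value no matter when the page is generated.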

I'm finding that even though Google indexes the site
daily and grabs stories for news.google.com, MANY
of my pages are not appearing in the Google index. It
appears that these are not being updated in Google's
cache either (I only have a couple of months of data to
go on). I'm quite knowledgeable about search engine
optimization, but this has me confused.

To make sure that Google re-indexes every month, I
have thought of sending a Last-Modified header using
the year/month/day of the article and a random
hour/minute/second. But if this random
hour/minute/second is "earlier" than the one already
indexed, would the page not get re-indexed?
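
To show why I'm worried about the "earlier" case, here is a rough sketch of the comparison an HTTP conditional request (If-Modified-Since) implies. The function name is hypothetical, and I am only assuming that a crawler compares dates this way; I don't know what Google actually does internally.

```php
<?php
// Hypothetical helper: returns true when the page should be treated as
// changed relative to the date the client already holds.
// $ifModifiedSince is the date from the client's previous fetch, or null
// when the client sent no If-Modified-Since header.
function is_modified_since(string $lastModified, ?string $ifModifiedSince): bool
{
    if ($ifModifiedSince === null) {
        return true; // Unconditional request: always send the full page.
    }
    $modified = strtotime($lastModified);
    $since    = strtotime($ifModifiedSince);
    if ($modified === false || $since === false) {
        return true; // Unparseable date: fall back to sending the page.
    }
    // Only a strictly later Last-Modified counts as a change.
    return $modified > $since;
}
```

Under that assumption, a random time that happens to land earlier than the previously reported one would make the page look unchanged.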

olinux


> On Wed, 28 May 2003 09:31:11 -0500, Jay Blanchard wrote:
> 
> >I wouldn't go as far as using the auto_prepend_file.
> 
> Neither would I in this case Jay.  It was simply an example of what
> could be done, not necessarily what SHOULD be done.  I did however, use
> auto_prepend_file in a .htaccess file for a somewhat similar case.
> 
> I have a site with about 90 pseudo-static pages (the page is static but
> I use PHP to include the header and footer) and a handful of fully
> dynamic pages.  I REALLY want this site to be regularly updated in the
> search engines but, unfortunately, many search engines only spider
> pages that are "newer" than what they have in their database.  Since
> PHP is dynamic, it doesn't report a "Last-Modified" header so the
> search engine doesn't think anything has been updated.  Hence stale
> search engine results.
> 
> To force all of the pages (both pseudo-static and dynamic) to generate
> a "Last-Modified" header, I set up a prepend.php script which is
> configured as a directory-level (.htaccess) parameter to
> auto_prepend_file.
> 
> Here is the content of prepend.php.....
> 
> 
> <?php 
> 
>   header( "Last-Modified: " . 
>     gmdate( "D, d M Y H:i:s", 
>        filemtime( $_SERVER['SCRIPT_FILENAME'] ) ) . 
>     " GMT" ); 
> 
> ?>
> 
> For my truly dynamic pages, I figured out that only the last call to
> header() for a given header name actually shows up in the "real" header
> that makes it to the browser (or search engine), so I can create a more
> specific "Last-Modified" header as part of the dynamic pages (like when
> the database is updated or whatever makes sense) and it will overwrite
> the automatically generated one.
> 
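
The override described above works because header() replaces an earlier header of the same name by default (its second argument, replace, defaults to true). Here is a minimal sketch of a dynamic page picking its own date. The function name and the idea of passing in row update timestamps are assumptions for illustration, not the quoted poster's code.

```php
<?php
// Hypothetical helper: given Unix timestamps of when the rows shown on a
// dynamic page were last updated, return a Last-Modified value for the
// most recent one.
function dynamic_last_modified(array $updatedAt): string
{
    if (count($updatedAt) === 0) {
        throw new InvalidArgumentException('Need at least one timestamp');
    }
    // The page is as new as its most recently changed row.
    $latest = max($updatedAt);
    return gmdate('D, d M Y H:i:s', $latest) . ' GMT';
}

// Usage in a dynamic page, after prepend.php has already run
// ($rowTimestamps is a hypothetical array built from the database):
// header('Last-Modified: ' . dynamic_last_modified($rowTimestamps));
```

Since this call comes after the one in prepend.php, its value is the one sent to the client.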

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
