It's not at all that I think I'm the expert you've summoned in your
last post on this thread. As you've been running your own high-traffic
"althttpd" server for many years, maybe you should explain the
Internet to us.

I was doing some (limited) research about RFCs, recommendations from
the Apache web server documentation, and blog posts by web browser
developers. It's often stated that "If-None-Match" should be given
precedence over "If-Modified-Since", but I haven't come across any
"hard facts".

I've created a small PHP script to test HTTP caching with ETags.

===== etags.php =====
<?php
$m=isset($_GET['m']) ? intval($_GET['m']) : 0; if ($m<0 || $m>4) $m=0;
$lmt=1519776000; // Wed, 28 Feb 2018 00:00:00 GMT
$etag='caffee'; // fixed
// handle If-Modified-Since (mode 1: always; modes 2-4: only without If-None-Match)
if (($m==1 || ($m>=2 && !isset($_SERVER['HTTP_IF_NONE_MATCH']))) &&
    isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) &&
    strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE'])==$lmt) {
  http_response_code(304); // Not Modified (PHP >= 5.4)
  exit;
}
// handle If-None-Match (modes 3-4; exact match against the fixed ETag only)
if ($m>=3 && isset($_SERVER['HTTP_IF_NONE_MATCH']) &&
    $_SERVER['HTTP_IF_NONE_MATCH']=='"'.$etag.'"') {
  http_response_code(304); // Not Modified (PHP >= 5.4)
  exit;
}
// generate new data
if ($m==2)
  $etag='--'; // none
else if ($m!=4)
  $etag=uniqid(); // random
if ($m!=2)
  header('ETag: "'.$etag.'"');
header('Last-Modified: '.gmdate('D, d M Y H:i:s',$lmt).' GMT'); // fixed
header('Cache-Control: must-revalidate, private');
header('Content-Type: text/plain; charset=utf-8');
echo
  "TEST HTTP CONDITIONAL REQUEST PRECEDENCE\n".
  "========================================\n\n".
  "• Always reply with fixed \"Last-Modified\" date header\n".
  "• Reply with random, fixed, or no ETag (see query parameters)\n\n".
  "• IF_MODIFIED_SINCE: always cache hit (send \"304/Not Modified\")\n".
  "• IF_NONE_MATCH: cache hit on ETag match, cache miss otherwise\n";
echo
  "\nQuery Parameters (Mode):\n\n".
  "• ?m=0: random ETag, no caching (default)\n".
  "• ?m=1: random ETag, favor IF_MODIFIED_SINCE (cache hit)\n".
  "• ?m=2: omit ETag, favor IF_MODIFIED_SINCE (cache hit)\n".
  "• ?m=3: random ETag, favor IF_NONE_MATCH (cache miss)\n".
  "• ?m=4: fixed ETag, favor IF_NONE_MATCH (cache hit)\n";
echo "\nScript State:\n\n";
echo "• Mode: $m\n";
echo "• Date: ".gmdate('D, d M Y H:i:s')." GMT\n";
echo "• ETag: $etag\n";
echo "\nRequest Headers:\n\n";
foreach ($_SERVER as $k=>$v)
  if (substr($k,0,5)=='HTTP_') echo "• $k: $v\n";
echo "\nResponse Headers:\n\n";
foreach (headers_list() as $k=>$v)
  echo "• $v\n";
?>
===== etags.php =====

The script always responds with a constant "Last-Modified" header, but
generates a random ETag on each run (there are also other options,
explained in the script output).
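
For comparison, here is a minimal sketch of how a server could apply the
precedence generally, outside of the test modes above. It follows the rule
in RFC 7232 (section 3.3) that a recipient ignores "If-Modified-Since"
whenever "If-None-Match" is present. The function name is_not_modified()
and its parameters are made up for illustration and are not part of
etags.php; splitting the header on commas is a simplification.

```php
<?php
// Hypothetical helper (not part of etags.php): decide whether a
// "304 Not Modified" applies. $ifNoneMatch and $ifModifiedSince are the
// raw request header values, or null if the header was not sent.
function is_not_modified(?string $ifNoneMatch, ?string $ifModifiedSince,
                         string $etag, int $lastModified): bool
{
    if ($ifNoneMatch !== null) {
        // If-None-Match is present: If-Modified-Since is ignored entirely.
        if (trim($ifNoneMatch) === '*') return true;
        // Simplification: assumes no commas inside the quoted ETag values.
        foreach (explode(',', $ifNoneMatch) as $candidate) {
            $candidate = trim($candidate);
            // Weak comparison: drop a leading "W/" before comparing.
            if (strncmp($candidate, 'W/', 2) === 0)
                $candidate = substr($candidate, 2);
            if ($candidate === '"'.$etag.'"') return true;
        }
        return false;
    }
    if ($ifModifiedSince !== null) {
        // Only reached when no If-None-Match header was sent.
        $since = strtotime($ifModifiedSince);
        return $since !== false && $lastModified <= $since;
    }
    return false;
}
```

With the values from the script, a request carrying a non-matching
"If-None-Match" and a matching "If-Modified-Since" is treated as a cache
miss, which is the behavior of mode 3.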

The idea is that whenever the ETag → "If-None-Match" route is taken,
the script output in the web browser is also updated, so it's obvious
that the client preferred "If-None-Match" over "If-Modified-Since".
The script should be reloaded with F5 and with Ctrl+F5 (which bypasses
the cache) to see the effects.

The clients I've tested all favor ETags over "If-Modified-Since", even
Internet Explorer 8 on Windows XP, so no harm here. But of course
"wget -N" without the extra Perl wrapper to add ETag support favors
"If-Modified-Since".

I can see that client-server clock skew and the low (1-second)
resolution of the "Last-Modified" timestamp may cause havoc, even more
so in conjunction with the --mtime option to set arbitrary timestamps
for unversioned files.
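
The resolution problem can be illustrated with a small sketch. The two
timestamps and file contents below are made up, and deriving the ETag from
an MD5 hash of the content is only an assumption for the example, not a
statement about how any particular server computes its ETags.

```php
<?php
// Two versions of a file written within the same second (made-up values).
$v1 = ['content' => "first\n",  'mtime' => 1519776000.20];
$v2 = ['content' => "second\n", 'mtime' => 1519776000.85];

// Format a timestamp the way "Last-Modified" carries it: whole seconds only.
function last_modified(float $mtime): string {
    return gmdate('D, d M Y H:i:s', (int)$mtime).' GMT'; // sub-second part is lost
}

// Assumption for this sketch: the ETag is a hash of the content.
function content_etag(string $content): string {
    return '"'.md5($content).'"';
}

// Both versions yield the same "Last-Modified" value ...
var_dump(last_modified($v1['mtime']) === last_modified($v2['mtime']));
// ... while the content-based ETags differ.
var_dump(content_etag($v1['content']) === content_etag($v2['content']));
```

A client that revalidates the first version with "If-Modified-Since" alone
would receive a 304 and keep stale content, whereas a comparison by ETag
detects the change.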

But maybe the above script is helpful to test more "exotic" clients on
the way to the decision to support ETags AND "Last-Modified" vs. ETags
only.
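
For clients that cannot simply press F5, a scripted revalidation is one
way to run the test. The following is a rough sketch only: the helper name
conditional_headers() is made up, the URL in the usage comment is a
placeholder, and the sketch sends both validators, as the browsers tested
above appear to do.

```php
<?php
// Hypothetical helper: turn the validators from a previous response into
// the conditional request headers a caching client would send next time.
// $responseHeaders is a list of raw header lines ("Name: value").
function conditional_headers(array $responseHeaders): array
{
    $out = [];
    foreach ($responseHeaders as $line) {
        if (stripos($line, 'ETag:') === 0)
            $out[] = 'If-None-Match: '.trim(substr($line, 5));
        elseif (stripos($line, 'Last-Modified:') === 0)
            $out[] = 'If-Modified-Since: '.trim(substr($line, 14));
    }
    return $out;
}

// Usage sketch (placeholder URL; adjust to where etags.php is served):
// $url = 'http://localhost/etags.php?m=3';
// file_get_contents($url);           // first request fills $http_response_header
// $ctx = stream_context_create(['http' => [
//     'header'        => conditional_headers($http_response_header),
//     'ignore_errors' => true,
// ]]);
// file_get_contents($url, false, $ctx);
// echo $http_response_header[0], "\n"; // status line of the revalidation
```

Dropping one of the two generated lines before the second request makes it
possible to check how the server reacts to each validator on its own.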

--Florian
_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
