On 2008-01-11 23:24:57 +0000, Tim Bunce wrote:
> On Fri, Jan 11, 2008 at 05:07:10PM +0300, Tomash Brechko wrote:
> > On Fri, Jan 11, 2008 at 16:54:26 +0300, Tomash Brechko wrote:
> > > I'd love to coordinate F_COMPRESSED flag.  C::M and C::M::F currently
> > > use 0x2 (0x1 is F_STORABLE).
> > 
> > Thinking more about this, perhaps we may do the following.  As I
> > understand most client libraries do not export flags to the user, but
> > use them internally for bookkeeping.  There are 16/32 (which one?)
> 
> 16 bits, per 
> http://code.sixapart.com/svn/memcached/trunk/server/doc/protocol.txt
> 
>   <flags> is an arbitrary 16-bit unsigned integer (written out in
>   decimal) that the server stores along with the data and sends back
>   when the item is retrieved. Clients may use this as a bit field to
>   store data-specific information; this field is opaque to the server.
>   Note that in memcached 1.2.1 and higher, flags may be 32-bits, instead
>   of 16, but you might want to restrict yourself to 16 bits for
>   compatibility with older versions.
> 
> I wonder why that says "may". Does anyone know?

Maybe because during parsing the flags are temporarily stored in a
variable of type int. That's probably 32 bits, but the C standard
guarantees only 16 bits (15 usable bits, actually: the text protocol
says that the flag is an "unsigned integer", but the variable is signed,
so you can only use the non-negative range).
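
To make the ranges concrete, here is a small sketch in Python (not taken
from memcached; the function name flag_portability is made up for
illustration). It relies only on the fact that the C standard requires
INT_MAX to be at least 32767.

```python
# Smallest INT_MAX a conforming C implementation may have (2**15 - 1).
MIN_INT_MAX = 2**15 - 1
UINT16_MAX = 2**16 - 1
UINT32_MAX = 2**32 - 1


def flag_portability(flags: int) -> str:
    """Say how wide a storage type a given flags value needs."""
    if flags < 0 or flags > UINT32_MAX:
        raise ValueError("flags must fit in an unsigned 32-bit integer")
    if flags <= MIN_INT_MAX:
        return "fits in any C int"
    if flags <= UINT16_MAX:
        return "needs all 16 bits (unsigned)"
    return "needs more than 16 bits"
```

For example, 0x7FFF is safe everywhere, while 0x8000 already depends on
the variable being unsigned or wider than 16 bits.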


> If it is now 32-bit then we'd know that the top 16 bits are very
> unlikely to be used at the moment and so we could adopt those
> for "informal standardisation" with little risk.
> 
> > flag bits total.  We may separate this space into three classes:
> > 
> >  1 common, shared among all clients.  F_COMPRESSED goes here, and we
> >    additionally agree that the compression algorithm is deflate
> >    (gzip).

Is this now common or do you just think it should be common? AFAICS
libmemcached doesn't treat any flag specially.


> >  2 common to the language family.  F_STORABLE goes here for the Perl
> >    family.

Ok.

> >  3 common to the particular client family, i.e. private for internal
> >    client use.  Please put F_UTF8 here ;).

That means at least an API change for Cache::Memcached, which doesn't
currently expose the flags to the application. Might be a good thing.
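
To illustrate what such a three-way split could look like, here is a
rough sketch. The bit ranges are purely an assumption for the example:
nothing in this thread fixes them, apart from the idea of reserving the
top 16 bits for the shared class. The names COMMON_MASK, LANGUAGE_MASK,
PRIVATE_MASK and split_flags are hypothetical.

```python
# Hypothetical partition of a 32-bit flags word into the three classes.
COMMON_MASK = 0xFFFF0000    # class 1: shared among all clients (assumed)
LANGUAGE_MASK = 0x0000FF00  # class 2: shared within a language family (assumed)
PRIVATE_MASK = 0x000000FF   # class 3: private to one client family (assumed)


def split_flags(flags: int) -> tuple[int, int, int]:
    """Return the (common, language, private) parts of a flags word."""
    return (
        flags & COMMON_MASK,
        flags & LANGUAGE_MASK,
        flags & PRIVATE_MASK,
    )
```

With this layout a client can ignore class 2 and class 3 bits it does
not understand and still honour the class 1 bits.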

> It's probably premature to get into this much detail. I will make one
> suggestion though:
> 
> Since information about utf8 encoding is likely to be of general use,
> I'd suggest using two bits in group #1:
> One to indicate the data is known to be utf8 encoded,

I fully agree with this. UTF-8 encoding is generally useful, so it
should go into class 1. In any case it is needed to correctly represent
Perl scalars, so it needs to be at least in the Perl section of class 2.
I see little reason to put it in class 3 - if I have to store that
information explicitly there are already other ways to do it.


> and another to indicate the client supports utf8 encoded data but that
> this data isn't.
> 
> Only if both bits are off would a client need to consider using a
> "treat as utf8 if it looks like utf8" heuristic.

I don't think using a heuristic here is a good idea. If the UTF8 flag is
off (and other type info flags like F_STORABLE are off, too), then it's
just an octet string and the interpretation is up to the application.
The application may know that this is UTF-8 encoded text and treat it as
such, or it might know that it's something else (e.g., some bitmap) and
treat it as such. I wouldn't want the library to convert from UTF-8 to
something else just because it happens to look like UTF-8. If 
application programmers want to use a heuristic, they should add it
themselves - at least then they know what they are doing.
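
A minimal sketch of the behaviour I have in mind, in Python for brevity.
The value of F_UTF8 is an assumption (no bit has been agreed on), and
decode_value is a made-up name, not part of any existing client.

```python
F_UTF8 = 0x4  # hypothetical bit; no value has been agreed on


def decode_value(data: bytes, flags: int):
    """Decode only when the flag says so; never guess from the content."""
    if flags & F_UTF8:
        # The writer declared the value to be UTF-8 encoded text.
        return data.decode("utf-8")
    # No type information: hand the octets back untouched and let the
    # application decide what they mean.
    return data
```

Note that a value which merely looks like UTF-8 comes back as bytes
unless the flag is set.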

        hp

-- 
   _  | Peter J. Holzer    | It took a genius to create [TeX],
|_|_) | Sysadmin WSR       | and it takes a genius to maintain it.
| |   | [EMAIL PROTECTED]         | That's not engineering, that's art.
__/   | http://www.hjp.at/ |    -- David Kastrup in comp.text.tex
