Mark,

I agree that CF is currently ambiguous on this front, and I'm fine with improving definitions going forward, but 'no_unit' smacks of the classic 'this page intentionally left blank' found in government documents. I think it's overkill, as backward compatibility will pretty much require that having no units attribute be interpretable as having a units attribute saying 'no_unit'.

Grace and peace,

Jim

On 11/4/14, 11:38 AM, Hedley, Mark wrote:
Hello Jim

> A variable with no units attribute at all is also pretty unambiguously a marker for something that isn't intended to be even a pure number.

If only this were the case.  CF conventions state that:
Units are not required for dimensionless quantities. A variable with no units attribute is assumed to be dimensionless. However, a units attribute specifying a dimensionless unit may optionally be included.
http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#units

Thus, the absence of a unit is to be interpreted identically to a statement that
units = '1'

This is the current situation and it is likely that there is lots of data like this around.
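
A minimal sketch of the rule quoted above, written for illustration only. It does not come from the CF document or from any library: the function name effective_units and the use of a plain dict to stand in for a variable's attributes are assumptions.

```python
# Hypothetical helper: a variable's attributes are modelled as a plain dict.
CF_DIMENSIONLESS = "1"


def effective_units(attrs):
    """Return the units a reader following CF 1.6 section 3.1 would assume."""
    # A variable with no units attribute is assumed to be dimensionless,
    # so a missing attribute is read the same way as units = '1'.
    return attrs.get("units", CF_DIMENSIONLESS)


with_units = {"standard_name": "cloud_area_fraction", "units": "1"}
without_units = {"standard_name": "area_type"}

# Both variables end up with the same effective unit.
print(effective_units(with_units) == effective_units(without_units))
```

Under this reading a consumer cannot tell "the producer meant a pure number" from "the producer meant no unit at all", which is the gap under discussion. An explicit units = '' is passed through unchanged here, since CF 3.1 does not say how to treat it.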

> Do we really need something more than a disambiguation of units = '1' vs no units attribute present?

Yes, I think we do: this situation is not ambiguous in CF, they are the same thing.

What I believe we require is a udunits entity which is clearly 'there is no unit of measure here, this is not dimensioned and not dimensionless'

The udunits value
''
delivers this functionality (I think), but it does not read very well, hence my suggestion that we ask for a new entry in udunits,
'no_unit'
which is hopefully clear in its meaning and interpretation
and which behaves the same as '' : failing all udunits processing attempts and operating as 'not a unit'
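
The requested behaviour could be sketched roughly as follows. This is a toy model and not the udunits API: parse_unit, multiply and NotAUnitError are hypothetical names, and 'no_unit' is only the proposed spelling.

```python
# Hypothetical sketch of the proposed semantics; not the udunits API.
NOT_A_UNIT = {"", "no_unit"}  # '' as described above, 'no_unit' as proposed


class NotAUnitError(ValueError):
    """Raised when a 'not a unit' marker is used in unit processing."""


def parse_unit(text):
    """Return the unit string unchanged, or raise if it marks 'not a unit'."""
    if text.strip() in NOT_A_UNIT:
        raise NotAUnitError(f"{text!r} is not a unit of measure")
    return text


def multiply(a, b):
    """Toy product of two unit strings; fails if either is 'not a unit'."""
    return f"{parse_unit(a)} {parse_unit(b)}"


print(multiply("m", "s"))
try:
    multiply("no_unit", "no_unit")
except NotAUnitError as err:
    print("rejected:", err)
```

The point of the sketch is that the marker is rejected before any arithmetic happens, so it can never be combined into a derived unit.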

all the best
mark

------------------------------------------------------------------------
*From:* CF-metadata [cf-metadata-boun...@cgd.ucar.edu] on behalf of Jim Biard [jbi...@cicsnc.org]
*Sent:* 31 October 2014 15:18
*To:* cf-metadata@cgd.ucar.edu
*Subject:* Re: [CF-metadata] string valued coordinates

Mark,

I'm not clear on what you are suggesting that udunits do with 'no_unit' or '?'.

I had thought that the desire was to be able to differentiate between a pure number (as you mention below) and a value (whether a string or a bit pattern) that should not be interpreted as any number at all.

As the situation stands, a units value of '1' is pretty unambiguously a marker for a pure number. We may need to modify docs to make this clearer, but I don't think that poses a problem. A variable with no units attribute at all is also pretty unambiguously a marker for something that isn't intended to be even a pure number. Again, we may need to modify docs to make this clearer. Because these two concepts are somewhat conflated in the current documentation and usage (area_type being an example), there is the issue of other places where cleanup would be good going forward, but even if you have a units value of '1' on a non-number, it doesn't hurt anything in practice.

Do we really need something more than a disambiguation of units = '1' vs no units attribute present?

Grace and peace,

Jim

On 10/31/14, 11:04 AM, Hedley, Mark wrote:
Thank you for all the responses, it sounds like 'all of the above' is the preferred response to my suggestions of plausible next steps. I will pursue all of these.

Eizi's point about having no_unit in udunits is sound; I suggest we request udunits use
  'no_unit'
as a representation of
'?'
such that the behaviour is consistent; 'no_unit' should always raise an exception when used in the udunits processing rules, exactly as '?' does.

With regard to meaning, I have found the wikipedia entry useful:
http://en.wikipedia.org/wiki/Dimensionless_quantity
'In dimensional analysis <http://en.wikipedia.org/wiki/Dimensional_analysis>, a *dimensionless quantity* or *quantity of dimension one* is a quantity <http://en.wikipedia.org/wiki/Quantity> without an associated physical dimension <http://en.wikipedia.org/wiki/Dimensional_analysis>. It is thus a "pure" number, and as such always has a dimension of 1. [1] <http://en.wikipedia.org/wiki/Dimensionless_quantity#cite_note-1>'
which it has sourced from
"*1.8* (1.6) *quantity of dimension one* dimensionless quantity" <http://www.iso.org/sites/JCGM/VIM/JCGM_200e_FILES/MAIN_JCGM_200e/01_e.html#L_1_8>. /International vocabulary of metrology --- Basic and general concepts and associated terms (VIM)/. ISO <http://en.wikipedia.org/wiki/International_Organization_for_Standardization>. 2008. Retrieved 2011-03-22.

This is a good enough source for me.

I will wait to give space for more comments, then, if people are content, I will raise a change request with udunits. Assuming this is accepted and processed I will raise a change request for CF to add some text to 3.1. Finally I will request a change for any standard_names which appear not to follow this approach (I have only 'area_type' so far).

I hope this seems like a reasonable response.

------------------------------------------------------------------------
*From:* Eizi TOYODA [toy...@gfd-dennou.org]
*Sent:* 31 October 2014 08:44
*To:* John Graybeal
*Cc:* Hedley, Mark; CF Metadata List
*Subject:* Re: [CF-metadata] string valued coordinates

Hi John

> I think '?' is not a definition that is helpful to most users -- it is more like an indication that the string -- the empty string in this case for example -- has not provided a meaningful indication of what the units are.

I share the same impression. I was thinking it would be nicer for the maintainer of udunits. We should ask for udunits to be modified so that it refuses to process "no_units"; otherwise ut_multiply("no_units", "no_units") returns "no_units 2". If I remember right, the unit string "?" causes an immediate error, so we don't have to change udunits.

But I'm okay if the majority here agrees that sort of thing is not a responsibility of udunits.

Best,
Eizi



Best Regards,
--
Eiji (aka Eizi) TOYODA
http://www.google.com/profiles/toyoda.eizi

On Fri, Oct 31, 2014 at 9:45 AM, John Graybeal <john.grayb...@marinexplore.com <mailto:john.grayb...@marinexplore.com>> wrote:

    Thanks for summing this up so neatly Mark!

    We could take the view that the conventions would benefit from
    the addition of some text into 3.1 to explicitly make the point
    about quantities which are not dimensioned or dimensionless.
    We could alternatively defer to udunits as most unit questions
    do, which already exhibits this behaviour, and just patch the
    'area_type' and any similar names with erroneous canonical units.
    We could also request that udunits be updated with a clearer
    string for this case, given the need for it, such as including
    the term 'no_units' as a valid udunits term to mean there are no
    units here: this is not dimensionless, this is not dimensioned.

    Why is the first option exclusive to the others? Seems useful to
    improve the documentation regardless.

    So I agree that '1' makes no sense for area_type. I'm wondering
    if someone can crisply describe what is meant when we (or
    UDUNITS) say a unit is dimensionless? I'm not entirely sure I get it.

    In any case, I think '?' is not a definition that is helpful to
    most users -- it is more like an indication that the string --
    the empty string in this case for example -- has not provided a
    meaningful indication of what the units are.

    So my ideal solution has CF well aligned with UDUNITS, and a
    clear concept and definition. Which I think suggests asking
    UDUNITS for a term 'no_units', defined as "the values do not have
    units; values are neither dimensioned nor dimensionless."

    John


    On Oct 30, 2014, at 11:06, Hedley, Mark
    <mark.hed...@metoffice.gov.uk
    <mailto:mark.hed...@metoffice.gov.uk>> wrote:

    > The unit of '1' is generally used to indicate fractions and
    the like. In cases where I am storing a raw binary value, I
    leave off the units attribute, as the 'number' isn't something
    that should be treated as a decimal quantity.

    This is the same behaviour as I was looking to adopt, but CF 3.1
    makes this incorrect, I believe, as a lack of a units attribute
    is to be interpreted as a units of '1'.

    I think a clear way to define that a quantity is not dimensioned
    and is not dimensionless is required.  I would have liked to use
    the lack of a unit for this purpose, but this has already been
    taken, so something else is needed.

    >My preference is that one explicitly puts in the units. For
    dimensionless, "1" or "" is ok for udunits.

    udunits2 treats '1' and '' differently.

      a unit of '1' has a definition of '1'
      a unit of '' has a definition of '?'

    The CF conventions description of units (3.1) states that an
    absence of a units attribute is deemed to be equivalent to
    dimensionless, a unit of '1'.  This is the convention, and it
    has been in force a long time.

    However CF makes no statement that I can find regarding a unit
    of ''. Thus I believe we defer back to udunits, which CF states
    is how units are defined.  Udunits states that a unit of '' is
    undefined, the quantity is not dimensioned and is not
    dimensionless.  We could adopt this to use for the cases in
    question.

    >area_type is given in the standard_name table as having a unit
    of 1. It is a categorical string-valued quantity.

    On the basis of the discussion, I would suggest that this is an
    error.  If area_type is a categorical string-valued quantity, it
    is not dimensionless, it is not continuous and numerical, it is
    not a pure number and should not be treated as such.  I think we
    should fix this.

    We could take the view that the conventions would benefit from
    the addition of some text into 3.1 to explicitly make the point
    about quantities which are not dimensioned or dimensionless.
    We could alternatively defer to udunits as most unit questions
    do, which already exhibits this behaviour, and just patch the
    'area_type' and any similar names with erroneous canonical units.
    We could also request that udunits be updated with a clearer
    string for this case, given the need for it, such as including
    the term 'no_units' as a valid udunits term to mean there are no
    units here: this is not dimensionless, this is not dimensioned.
    I don't mind which route is preferred, I'm happy to put a change
    together and pursue it; whichever way seems better to people.

    cheers
    mark

    ------------------------------------------------------------------------
    *From:* CF-metadata [cf-metadata-boun...@cgd.ucar.edu
    <mailto:cf-metadata-boun...@cgd.ucar.edu>] on behalf of Jim
    Biard [jbi...@cicsnc.org <mailto:jbi...@cicsnc.org>]
    *Sent:* 30 October 2014 16:12
    *To:* cf-metadata@cgd.ucar.edu <mailto:cf-metadata@cgd.ucar.edu>
    *Subject:* Re: [CF-metadata] string valued coordinates

    CF says that if the units attribute is missing, then the
    quantity has no units.

    The Conventions document, section 3.1 says:

    The units attribute is required for all variables that represent
    dimensional quantities (except for boundary variables defined
    in Section 7.1, "Cell Boundaries"
    <http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#cell-boundaries>
    and climatology variables defined in Section 7.4, "Climatological
    Statistics"
    <http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#climatological-statistics>).

    and

    Units are not required for dimensionless quantities. A variable
    with no units attribute is assumed to be dimensionless. However,
    a units attribute specifying a dimensionless unit may optionally
    be included. The Udunits package defines a few dimensionless
    units, such as percent, but is lacking commonly used units such
    as ppm (parts per million). This convention does not support the
    addition of new dimensionless units that are not udunits
    compatible. The conforming unit for quantities that represent
    fractions, or parts of a whole, is "1". The conforming unit for
    parts per million is "1e-6". Descriptive information about
    dimensionless quantities, such as sea-ice concentration, cloud
    fraction, probability, etc., should be given in
    the long_name or standard_name attributes (see below) rather
    than the units.

    The unit of '1' is generally used to indicate fractions and the
    like. In cases where I am storing a raw binary value, I leave
    off the units attribute, as the 'number' isn't something that
    should be treated as a decimal quantity.

    Grace and peace,

    Jim

    On 10/30/14, 11:35 AM, John Caron wrote:
    My preference is that one explicitly puts in the units. For
    dimensionless, "1" or "" is ok for udunits. If the units
    attribute isn't there, I assume that the user forgot to specify
    it, so the units are unknown.

    I'm not sure what CF actually says, but it would be good to clarify.

    John

    On Thu, Oct 30, 2014 at 2:37 AM, Hedley,
    Mark <mark.hed...@metoffice.gov.uk
    <mailto:mark.hed...@metoffice.gov.uk>> wrote:

        Hello CF

        > From: CF-metadata [cf-metadata-boun...@cgd.ucar.edu
        <mailto:cf-metadata-boun...@cgd.ucar.edu>] on behalf of
        Jonathan Gregory [j.m.greg...@reading.ac.uk
        <mailto:j.m.greg...@reading.ac.uk>]

        > Yes, there are some standard names which imply string
        values, as Karl says. If the standard_name table says 1,
        that means the quantity is dimensionless, so it's also fine
        to omit the units, as Jim says.

        I would like to raise a question about this statement.
        Omitting the units and stating that the units are '1' are
        two very different things;
        dimensionless != no_unit
        is an important statement which should be clear to data
        consumers and producers.

        If the standard name table defines a canonical unit for a
        standard_name of '1' then I expect this quantity to be
        dimensionless, with a unit of '1' or some multiple thereof.
        If the standard name states that the canonical unit for a
        standard_name is '' then I expect that quantity to have no
        unit stated.
        Any deviation from this behaviour is a break with the
        conventions. I have code which explicitly checks this for
        data sets.

        Are people aware of examples of the pattern of use
        described by Jonathan, such as categorical quantities
        identified by a standard_name with a canonical unit of '1'?

        thank you
        mark

        _______________________________________________
        CF-metadata mailing list
        CF-metadata@cgd.ucar.edu <mailto:CF-metadata@cgd.ucar.edu>
        http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata





    --
    CICS-NC <http://www.cicsnc.org/> Visit us on
    Facebook <http://www.facebook.com/cicsnc>     *Jim Biard*
    *Research Scholar*
    Cooperative Institute for Climate and Satellites
    NC<http://cicsnc.org/>
    North Carolina State University<http://ncsu.edu/>
    NOAA's National Climatic Data Center<http://ncdc.noaa.gov/>
    151 Patton Ave, Asheville, NC 28801
    e:jbi...@cicsnc.org <mailto:jbi...@cicsnc.org>
    o: +1 828 271 4900








