Re: [cc-devel] Exif metadata

Mike Linksvayer Sun, 20 Aug 2006 21:31:43 -0700

On Fri, 2006-08-18 at 22:03 -0400, Luis Villa wrote:
> [I'm not on cc-metadata, and it is deleting all my mail instead of
> holding it for moderation, FWIW, which may explain some of my missing
> context.]

Belatedly I've added you to a list of always-accept addresses for
cc-metadata.  All public Creative Commons lists are set to reject email
from non-subscribers -- their moderation queues became nearly 100% spam
at some point.

I added cc-metadata to this discussion as it is where all related
discussions have occurred up to this point.  I'm thinking of putting
cc-metadata into hibernation and perhaps inviting cc-metadata
subscribers to subscribe to cc-devel.  The latter would never have been
created if the former's name did not signal a set of topics smaller than
the current scope of CC-related technical discussions.

> On 8/18/06, Mike Linksvayer <[EMAIL PROTECTED]> wrote:
> > Let me prefix this by saying I hate embedded metadata and would be happy
> > if nobody ever included a CC license notice in it but there's a there
> > there so some people feel a need to use embedded metadata to note license
> > status AND there is a longstanding desire from CC to mitigate against
> > people adding fraudulent license claims to say madonna.mp3 and having
> > that be people's introduction to CC ... thus this onerous scheme.
> 
> Hrm. Interesting problem, but the reaction to it smells like premature
> optimisation to me. Now that wide-scale CC-enabled services like
> flickr have existed for a couple years, do we have any examples of
> this happening on any wide scale? I've never seen or heard an example
> of it, but I've certainly not been paying wide attention to it.

There wouldn't have been as nobody is putting any CC license info in
image files, direct or indirect.

The main problem with use of CC licensed images found on Flickr seems
to be lack of attribution (another reason to prefer a reference to the
copyright holder, not the license, which does not provide attribution).
One of my own corny images http://flickr.com/photos/mlinksva/164411876/
was used in http://trends.newsforge.com/trends/06/06/12/054221.shtml --
I have seen non-attribution of others' images, I just happen to have a
link to mine. :) [Not that I mind -- I'd put my images in the public
domain if Flickr gave the option.]

In a field called "copyright" (which is what Exif offers) a URL
referencing the copyright holder also maps better to non-URL use than
does a license URL.  I haven't looked at many JPEGs, but in other
formats I see "John Smith", never "All Rights Reserved" in the similar
field.
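
A minimal sketch of how a reader of such a "copyright" field might cope
with both kinds of content, a plain name like "John Smith" or a name plus
a URL pointing at the copyright holder's notice.  The helper names
(notice_url, holder_text) and the example.org address are hypothetical,
and nothing here reads Exif itself; it only handles the string once some
other tool has extracted it.

```python
import re
from typing import Optional

# Hypothetical helpers, not part of any Exif library: they operate on the
# free-text value of a "copyright" field that has already been extracted.
_URL_RE = re.compile(r"https?://[^\s<>\"]+")


def notice_url(copyright_field: str) -> Optional[str]:
    """Return the first http(s) URL in the field, or None if there is none."""
    match = _URL_RE.search(copyright_field)
    if match is None:
        return None
    # Trim trailing punctuation that usually belongs to the sentence.
    return match.group(0).rstrip(".,;:)")


def holder_text(copyright_field: str) -> str:
    """Return the field with any URLs removed, for display by URL-unaware software."""
    return " ".join(_URL_RE.sub("", copyright_field).split())


field = "John Smith http://example.org/jsmith/photos/"
print(notice_url(field))
print(holder_text(field))
```

A field that holds only a name still works: notice_url returns None and
holder_text returns the name unchanged.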

MP3 files found in the wild (meaning not directly downloaded from
archive.org or similar) that have some sort of CC license indicator in
embedded metadata are, as far as I can tell, mostly by artists that I
have no reason to think have CC licensed anything.  This is just very
casual observation, no data.

> My gut
> feeling is that this would be a rare problem, and the best way to
> counter it would be to radically lower the barrier to tagging with
> correct metadata- drown out the bad data (if any) with a stream of
> good.
> 
> I might note that if you go with the 'simple' license URL, and the
> problem of false re-licensing becomes really bad, the worst case
> scenario is that you deprecate using the straight license URL and
> require publication. That cost seems much preferable to raising high
> barriers before the license standard is a success.
> 
> > See
> > discussion on this list probably starting in April 2003, though it is
> > probably missing context from internal CC discussions.
> 
> <nod>
> 
> > On Fri, 2006-08-18 at 20:42 -0400, Luis Villa wrote:
> > > On 8/18/06, Mike Linksvayer <[EMAIL PROTECTED]> wrote:
> > > > A web notice gives one the level of assurance that one normally gets
> > > > from the web ... as opposed to zero.
> > >
> > > Ah! yes. We raise it from zero to... practically zero :) Seriously,
> > > this buys no protection against any serious/meaningful attempts at
> > > fraud, while making it incredibly onerous for the vast, vast majority
> > > of the population that can't guarantee a permanent web presence.
> >
> > 1. Archive.org, flickr and the like provide permanent web presence for
> > them.
> 
> It's still a barrier to entry; worse, phrased that way, it is a
> barrier to entry *and* lockin to platforms that users don't control.
> Flickr changes their URL scheme slightly?

Yes, lock-in is a major problem that needs to be solved -- anyone else
interested in this, I posted some links about it, including to Luis'
blog, at http://gondwanaland.com/mlog/2006/07/28/free-software-p2p/ --
but I don't think CC recommending that people publish copyright notice
on the web (which is what we do in the usual case anyway, the
recommendation at hand just makes embedded metadata not a special case)
makes the problem appreciably worse.

> Oops, all your data suddenly
> has no valid license.

No!  A CC license is "valid" for a work because a copyright holder has
offered it to the public.  [Non-]conformance with a technical
recommendation for annotating a work with license info does not make a
license [in]valid.  The best annotations can do is provide additional
context as to whether a valid offer was made.

In some cases web.archive.org may help.  If I were a high-risk user of CC
licensed content I would, among other things, save, screenshot, or print
any license notice (probably found on a web page) at the time I found the
putatively licensed content.  Hopefully someday "Digital Asset
Management" systems used by ad agencies and the like (and your web
superbrowser) will do this automagically.
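
A rough sketch of what such automatic record keeping could look like,
assuming the tool already has the bytes of the page carrying the license
notice.  The function name notice_record, the field names, and the
example.org URLs are all made up for illustration; this is not any
existing system's format.

```python
import hashlib
import json
from datetime import datetime, timezone


def notice_record(work_url: str, notice_url: str, notice_bytes: bytes) -> dict:
    """Build a small record of a license notice as seen at retrieval time.

    Hypothetical format: where the work was found, where the notice was
    found, when it was seen (UTC), and a hash of the saved notice page.
    """
    return {
        "work_url": work_url,
        "notice_url": notice_url,
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
        "notice_sha256": hashlib.sha256(notice_bytes).hexdigest(),
        "notice_length": len(notice_bytes),
    }


record = notice_record(
    "http://example.org/photos/123.jpg",
    "http://example.org/photos/123",
    b"<html>... license notice as served ...</html>",
)
print(json.dumps(record, indent=2))
```

The record only helps alongside the saved copy of the page itself; the
hash just makes it possible to show later that the copy was not altered.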

> CC could perhaps resolve this problem by offering a web-facing central
> license registry, much like the PTO does. This could even integrate a
> basic license validation service- allow the metadata-reader to pass a
> checksum of the image to the web service, and use that to verify the
> work. I'd much rather trust CC as a license data repository than any
> other third party.

CC decided before it launched to not do that.
http://wiki.creativecommons.org/FAQ#Is_Creative_Commons_building_a_database_of_licensed_content.3F

I think the reasons were and are, roughly
* legal liability
* CC wouldn't get it right, better to let others have multiple tries,
the decentralized (semantic) web will eventually prove out (IIRC the
original CC FAQ, before my time, said something about eschewing
soviet-style centralization, which is a bit over the top)
* CC did not and does not have the resources necessary
* Granularity of a work problem
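
For readers trying to picture the proposal quoted above, the checksum
lookup might look roughly like the sketch below.  It is purely
illustrative: CC did not build such a service, the LicenseRegistry class
is hypothetical, and an in-memory dict stands in for whatever a real
web-facing registry would use.

```python
import hashlib
from typing import Dict, Optional


class LicenseRegistry:
    """In-memory stand-in for a hypothetical central license registry."""

    def __init__(self) -> None:
        # Maps a file checksum to the license URL claimed for that file.
        self._claims: Dict[str, str] = {}

    @staticmethod
    def checksum(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    def register(self, data: bytes, license_url: str) -> str:
        """Record a license claim for these exact bytes; return the checksum."""
        digest = self.checksum(data)
        self._claims[digest] = license_url
        return digest

    def lookup(self, data: bytes) -> Optional[str]:
        """Return the registered license URL, or None if the bytes are unknown."""
        return self._claims.get(self.checksum(data))


registry = LicenseRegistry()
registry.register(b"original image bytes", "http://creativecommons.org/licenses/by/2.5/")
print(registry.lookup(b"original image bytes"))
print(registry.lookup(b"resized or re-saved image bytes"))
```

An exact checksum only matches byte-identical files, so a resized,
re-encoded, or re-tagged copy of the same image would not be found.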

> > 2. A URI that dies is uncool.
> 
> Yes.
> 
> > The content mustn't have been that
> > valuable.
> 
> No.
> 
> Email addresses that die are uncool; arguably worse than URIs that die
> (because they are the RI for a *person*, not just data) and yet out in
> the real world they happen. The never-ending, never-moving URI is a
> very nice luxury that the digital elite have, but I'd wager it isn't
> very common for most people.
> 
> And even if you disagree with that analysis, the goal is to have *lots
> of people* licensing *lots of content*- not only "valuable" content.
> If Free Software licenses vanished into the ether, we'd be a lot
> poorer- stumbling across old code somewhere that no longer has a home,
> and finding it a new home, is one of the little things that helps free
> software succeed; CC should learn from that.
> 
> (Tangentially, isn't one of Lessig's key policy goals that we should
> make it easier to find out copyright status of old materials, not
> harder?)

I don't disagree with any of the above, but I'm not sure how a bare
license URL would be enough for anyone who actually cares about
copyright status to feel comfortable using lost and found material.  

> > > > > That seems incredibly onerous.
> > > >
> > > > It may be, but if I may repeat myself, embedding a reference to a
> > > > license itself is incredibly worthless.
> > >
> > > You're demanding a higher level of accountability with this than with
> > > any other licensing system I've ever seen. When I publish my code
> > > under GPL, I don't include a link in the source saying 'this is a link
> > > to a webpage 'proving' that the code is under GPL', I just do it.
> > > People publish books under CC all the time which just say 'the license
> > > is foo', even though PDFs, HTML, and text are all editable- just like
> > > the exif fields. I'm really not clear why EXIFs, as opposed to any
> > > other editable content format ever, deserve this special publisher
> > > burden.
> >
> > Printed books and code provide lots of other context by which one
> > can judge provenance
> 
> Surely madonna fans can tell the provenance of a madonna song, if that
> is the concern?
> 
> I do see the point, in the other response, that images and music are
> different from text/code, in that there is no way to express the
> license 'up front'. If anything, though, it feels to me that this is a
> good reason to make machine-readable metadata

Yes, a good reason to reference machine-readable metadata reasonably
colocated with human-visible equivalent statements, i.e., on the web. :)

> > and there's no (or precious little) attempt to make
> > printed copyright notices or license headers/COPYRIGHT.txt accompanying
> > code machine readable.
> 
> Which is a mistake, but tangential to this discussion :)
> 
> > However, an alternative is to make embedded metadata less machine
> > readable.
> 
> Nonono! Machine-readable licenses lower the barrier to remix and
> reuse- which should be a critical goal for CC.

Good, that's the conclusion I came to regarding the silly "Copyright ...
verify at ..." English sentence CC once recommended for MP3/ID3 a while
back.

-- 
  http://wiki.creativecommons.org/User:Mike_Linksvayer

_______________________________________________
cc-devel mailing list
[email protected]
http://lists.ibiblio.org/mailman/listinfo/cc-devel
