Re: [Crm-sig] Fixity Hash in CRM Addendum

2015-09-11 Thread Conal Tuohy
This might also be a good time to dip into FRBRoo.

On 11 September 2015 at 13:14, daniel riley  wrote:

> Hello folks,
>
> I'm adding a bit to this question since I think it's relevant to anyone in
> digital preservation. If anyone finds it off-topic, let me know.
>
> So, where we left off was that perhaps E38_Image wasn't the best entity to
> express a digital image of an artwork since E38_Image doesn't specify a
> concrete manifestation of that image.  However, the scope note for
> P138_represents explicitly states:
>
> "This property is also used for the relationship between an original and
> a digitisation of the original by the use of techniques such as digital
> photography, flatbed or infrared scanning."
>
> So it seems like the property is correct for specifying a digital version
> of the work but perhaps the Range entity is incorrect. Should I simply be
> using the superclass E73_Information_Object rather than E38_Image as the
> range, if I want to specify a digital image file with a specific set of
> bytes?
>
> Thanks,
> Daniel Riley
>
> On Wed, Sep 9, 2015 at 6:07 PM, daniel riley  wrote:
>
>> Hi Simon,
>>
>> That makes sense. For instance, one image could have multiple sizes. We
>> would think about them as the same image but their hashes would be
>> completely different.  I am not as familiar with FRBRoo, but I took a look
>> at F4 Manifestation Singleton, and I'm not sure if its intention is
>> something like this.
>>
>> One thing that is confusing is that in many cases, like in the British
>> Museum example here:
>>
>> http://collection.britishmuseum.org/resource?uri=http%3A%2F%2Fwww.britishmuseum.org%2Fcollectionimages%2FAN00037%2FAN00037369_001_l.jpg
>>
>> The resource is a specific digital version of an image with a specific
>> asset id and a specific filename. So it would seem that if I added a
>> property about that resource it would be about the specific binary data,
>> and not about all possible versions of that image.
>>
>> If anyone knows of an example implementation that addresses fixity it
>> would be a great help.
>>
>> Thanks,
>> Dan
>>
>> P.S. I was using British Museum's linked data as a guide for most of my
>> work:
>>
>> On Wed, Sep 9, 2015 at 5:23 PM, Simon Spero  wrote:
>>
>>> Another problem with this is that a hash of a bit string does not
>>> identify an Image (even if the hash is 1:1).
>>>
>>> An Image is abstract and conceptual, and has an identity that is preserved
>>> across transformations that would generate different bit strings.
>>>
>>> Going the other way,  I believe that CIDOC does require that the same
>>> bit string not correspond to multiple images. For example, an imaging
>>> sensor might capture an image with the shutter closed at the start of a
>>> series of measurements - such an image could be used for calibration.
>>> Many such images might have identical bit strings, but would be
>>> conceptually different works under some stances. However,  since they have
>>> indistinguishable appearances, they are the same Image.
>>>
>>> Fixity hashes might be better treated as properties of a FRBRoo
>>> Manifestation; such properties are intrinsic to the Manifestation*; they
>>> are not externally assigned in the same way that a URI, accession number,
>>> etc. are.
>>>
>>> Simon
>>> * or as the value of a property that must be the same for every item
>>> that is an instance of that Manifestation
>>> On Sep 9, 2015 4:15 PM, "daniel riley"  wrote:
>>>
>>>> Hello all,
>>>>
>>>> I wanted to get confirmation on the correct application of the
>>>> CIDOC CRM in the case of checksum hashes (i.e. fixity values).
>>>>
>>>> For instance, if the hash of a digital image file computes to:
>>>> 6b8dca09e851a987050463c9c60603e9ad797ba09117056fc2e0c07bcac66e43
>>>>
>>>> My first thought would be to use:
>>>>
>>>> E38_Image - P1_is_identified_by - E42_Identifier (hash value)
>>>> E42_Identifier - P2_has_type - "SHA256 HASH"
>>>>
>>>> However, the scope notes for E42_Identifier explicitly state:
>>>> The class E42 Identifier is not normally used for machine-generated
>>>> identifiers
>>>>
>>>> A hash is definitely machine generated, so what are the other options
>>>> here? Should I use a different ontology for this case?
>>>>
>>>> Thanks,
>>>> Daniel Riley
>>>> Verisart
>>>>
>>>> ___
>>>> Crm-sig mailing list
>>>> Crm-sig@ics.forth.gr
>>>> http://lists.ics.forth.gr/mailman/listinfo/crm-sig


>>
>


-- 
Conal Tuohy
http://conaltuohy.com/
@conal_tuohy
+61-466-324297


Re: [Crm-sig] Fixity Hash in CRM Addendum

2015-09-11 Thread daniel riley
Hello folks,

I'm adding a bit to this question since I think it's relevant to anyone in
digital preservation. If anyone finds it off-topic, let me know.

So, where we left off was that perhaps E38_Image wasn't the best entity to
express a digital image of an artwork since E38_Image doesn't specify a
concrete manifestation of that image.  However, the scope note for
P138_represents explicitly states:

"This property is also used for the relationship between an original and a
digitisation of the original by the use of techniques such as digital
photography, flatbed or infrared scanning."

So it seems like the property is correct for specifying a digital version
of the work but perhaps the Range entity is incorrect. Should I simply be
using the superclass E73_Information_Object rather than E38_Image as the
range, if I want to specify a digital image file with a specific set of
bytes?
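
To make "a specific set of bytes" concrete, here is a minimal Python sketch
of how a fixity value attaches to one file's bytes rather than to the
abstract image. It is only an illustration under stated assumptions, not an
agreed CRM or FRBRoo modelling: the function names, the example URI, and the
`ex:has_fixity_value` and `ex:fixity_algorithm` properties are hypothetical
placeholders and are not terms from either ontology.

```python
import hashlib
import io


def sha256_fixity(stream, chunk_size=65536):
    """Return the hex SHA-256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def fixity_triples(file_uri, hex_digest):
    """Build illustrative (subject, predicate, object) tuples for one file.

    ASSUMPTION: both 'ex:' predicates are made-up placeholders; which class
    and property should really carry the hash is the open question here.
    """
    return [
        (file_uri, "ex:has_fixity_value", hex_digest),
        (file_uri, "ex:fixity_algorithm", "SHA-256"),
    ]


# Stand-ins for two renditions (e.g. two sizes) of the "same" image.
large_bytes = b"abc"
small_bytes = b"abcd"

large_hash = sha256_fixity(io.BytesIO(large_bytes))
small_hash = sha256_fixity(io.BytesIO(small_bytes))

# Different byte strings give different digests, so the hash describes
# one concrete file, not every version of the image.
print(large_hash != small_hash)

# Hypothetical URI for the large rendition.
for triple in fixity_triples("http://example.org/asset/large.jpg", large_hash):
    print(triple)
```

For a real file, the same function could be called on a stream opened with
open(path, "rb"); the subject of the triples is then the specific file, which
is why the choice of class for that subject matters.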

Thanks,
Daniel Riley

On Wed, Sep 9, 2015 at 6:07 PM, daniel riley  wrote:

> Hi Simon,
>
> That makes sense. For instance, one image could have multiple sizes. We
> would think about them as the same image but their hashes would be
> completely different.  I am not as familiar with FRBRoo, but I took a look
> at F4 Manifestation Singleton, and I'm not sure if its intention is
> something like this.
>
> One thing that is confusing is that in many cases, like in the British
> Museum example here:
>
> http://collection.britishmuseum.org/resource?uri=http%3A%2F%2Fwww.britishmuseum.org%2Fcollectionimages%2FAN00037%2FAN00037369_001_l.jpg
>
> The resource is a specific digital version of an image with a specific
> asset id and a specific filename. So it would seem that if I added a
> property about that resource it would be about the specific binary data,
> and not about all possible versions of that image.
>
> If anyone knows of an example implementation that addresses fixity it
> would be a great help.
>
> Thanks,
> Dan
>
> P.S. I was using British Museum's linked data as a guide for most of my
> work:
>
> On Wed, Sep 9, 2015 at 5:23 PM, Simon Spero  wrote:
>
>> Another problem with this is that a hash of a bit string does not
>> identify an Image (even if the hash is 1:1).
>>
>> An Image is abstract and conceptual, and has an identity that is preserved
>> across transformations that would generate different bit strings.
>>
>> Going the other way,  I believe that CIDOC does require that the same bit
>> string not correspond to multiple images. For example, an imaging sensor
>> might capture an image with the shutter closed at the start of a series of
>> measurements - such an image could be used for calibration.
>> Many such images might have identical bit strings, but would be
>> conceptually different works under some stances. However,  since they have
>> indistinguishable appearances, they are the same Image.
>>
>> Fixity hashes might be better treated as properties of a FRBRoo
>> Manifestation; such properties are intrinsic to the Manifestation*; they
>> are not externally assigned in the same way that a URI, accession number,
>> etc. are.
>>
>> Simon
>> * or as the value of a property that must be the same for every item
>> that is an instance of that Manifestation
>> On Sep 9, 2015 4:15 PM, "daniel riley"  wrote:
>>
>>> Hello all,
>>>
>>> I wanted to get confirmation on the correct application of the CIDOC CRM
>>> in the case of checksum hashes (i.e. fixity values).
>>>
>>> For instance, if the hash of a digital image file computes to:
>>> 6b8dca09e851a987050463c9c60603e9ad797ba09117056fc2e0c07bcac66e43
>>>
>>> My first thought would be to use:
>>>
>>> E38_Image - P1_is_identified_by - E42_Identifier (hash value)
>>> E42_Identifier - P2_has_type - "SHA256 HASH"
>>>
>>> However, the scope notes for E42_Identifier explicitly state:
>>> The class E42 Identifier is not normally used for machine-generated
>>> identifiers
>>>
>>> A hash is definitely machine generated, so what are the other options
>>> here? Should I use a different ontology for this case?
>>>
>>> Thanks,
>>> Daniel Riley
>>> Verisart
>>>
>