OIDs / II
Bert Verhees wrote:

>> presented as belonging together,
>
> it is not that simple: a collection of information can have more than one number, and the CEN standard does not provide meta-information in some cases. E.g. PatientExtendedInformation carries a SetII, which is a set of numbers (identifiers); those numbers in a list do not have meta-information, except for the OID, but that meta-information can only be resolved over a network service (which does not yet exist).

As I have indicated before, I think that CEN needs the data type we added to openEHR, DV_IDENTIFIER, whose definition is below. I also don't agree that it is realistic to identify everything with OIDs. There are three reasons, the major one Bert has already given.

1. non-accessibility and/or performance of the resolving engine
2. size of ids inside data, particularly data fragments that can never be sensibly accessed globally, only from within the context of some larger blob. E.g. it doesn't make sense to access a single ELEMENT in a CEN CLUSTER/ELEMENT tree, so why attach a 20-digit OID to it?

    class DV_IDENTIFIER

    inherit
        DATA_VALUE

    feature -- Access

        issuer: STRING
            -- Issuing agency of this kind of id

        id: STRING
            -- The identifier value. Often structured, according to the
            -- definition of the issuing authority's rules.

        type: STRING
            -- The identifier type, such as "prescription", or "SSN".
            -- One day a controlled vocabulary might be possible for this.

    invariant
        issuer_valid: issuer /= Void and not issuer.is_empty
        id_valid: id /= Void and not id.is_empty
        type_valid: type /= Void and not type.is_empty

    end

- thomas beale

-
If you have any questions about using this list, please send a message to d.lloyd at openehr.org
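For readers who do not use Eiffel, the DV_IDENTIFIER definition quoted in the message above can be approximated in Python. This is only a minimal sketch, not part of the openEHR specification: the class name `DvIdentifier`, the use of a dataclass, and the sample values ("example-agency", "123-45-6789") are assumptions made for illustration. The three checks mirror the `issuer_valid`, `id_valid` and `type_valid` invariants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DvIdentifier:
    """Illustrative Python rendering of DV_IDENTIFIER (not normative)."""

    issuer: str  # issuing agency of this kind of id
    id: str      # the identifier value, structured per the issuer's rules
    type: str    # identifier type, e.g. "prescription" or "SSN"

    def __post_init__(self) -> None:
        # Mirror the three invariants: every field is present and non-empty.
        for name in ("issuer", "id", "type"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value:
                raise ValueError(f"{name} must be a non-empty string")


# Hypothetical sample values, chosen only to show construction.
ssn = DvIdentifier(issuer="example-agency", id="123-45-6789", type="SSN")
```

Because `type` travels with the value, a receiver can tell what kind of identifier it holds without contacting any resolving service, which is the point made in the message.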
OIDs / II
On Wednesday 27 April 2005 11:21, Thomas Beale wrote:

> Bert Verhees wrote:
>
>>> presented as belonging together,
>>
>> it is not that simple: a collection of information can have more than one number, and the CEN standard does not provide meta-information in some cases. E.g. PatientExtendedInformation carries a SetII, which is a set of numbers (identifiers); those numbers in a list do not have meta-information, except for the OID, but that meta-information can only be resolved over a network service (which does not yet exist).
>
> As I have indicated before, I think that CEN needs the data type we added to openEHR, DV_IDENTIFIER, whose definition is below.

It would help me a lot if CEN adopted this instead of the original ID; it has everything I need for my purpose.

Bert

> I also don't agree that it is realistic to identify everything with OIDs. There are three reasons, the major one Bert has already given.
>
> 1. non-accessibility and/or performance of the resolving engine
> 2. size of ids inside data, particularly data fragments that can never be sensibly accessed globally, only from within the context of some larger blob. E.g. it doesn't make sense to access a single ELEMENT in a CEN CLUSTER/ELEMENT tree, so why attach a 20-digit OID to it?
>
>     class DV_IDENTIFIER
>
>     inherit
>         DATA_VALUE
>
>     feature -- Access
>
>         issuer: STRING
>             -- Issuing agency of this kind of id
>
>         id: STRING
>             -- The identifier value. Often structured, according to the
>             -- definition of the issuing authority's rules.
>
>         type: STRING
>             -- The identifier type, such as "prescription", or "SSN".
>             -- One day a controlled vocabulary might be possible for this.
>
>     invariant
>         issuer_valid: issuer /= Void and not issuer.is_empty
>         id_valid: id /= Void and not id.is_empty
>         type_valid: type /= Void and not type.is_empty
>
>     end
>
> - thomas beale

--
Kind regards
Bert Verhees
ROSA Software
OIDs / II
Dear all,

My ideas:

- unique identifiers are numbers that are unique.
- each collection of information that has an attribute with this unique number can be collected and presented as belonging together,
- with one unique identifier per (pseudo)identity all information belonging to this unique identifier can be collected and presented as belonging together
- this type of use is identifying documents (or parts of them) as containing information about the same person with a specific identity.
- it is NO PROOF of the real identity of the person. That is a different matter.
- When we have to uniquely identify persons we need other things than numbers.
- Unique numbers must not be trusted.
- Unique numbers that identify persons generate problems: identity theft.
- Only knowledge that is known by the person, or features his body possesses, will help to identify persons.

Gerard

-- private --
Gerard Freriks, arts
Huigsloterdijk 378
2158 LR Buitenkaag
The Netherlands
+31 252 544896
+31 654 792800

On 20 Apr 2005, at 12:43, Bert Verhees wrote:

> Dear Grahame,
>
> For example, the CEN GPIC subjectofcare has a property id. Its type is a Set of II. The use is explained as: "An identifier or identifiers that may be used to uniquely identify the subject of care. Examples: social security number, health service number, hospital number, case notes number."
>
> Please indicate where there is a mismatch between the intention and the use of II. CEN TC251 could learn from that. It would be a great benefit to the standard if this were sorted out. And if it is, then the need for an extra qualifier that tells which type of identifier is presented may disappear, depending on your solution.
>
> Kind regards
> Bert Verhees
OIDs / II
On Tuesday 26 April 2005 07:52, Gerard Freriks wrote:

> Dear all,
>
> My ideas:
>
> - unique identifiers are numbers that are unique.

This is not true: the numbers themselves are not unique. A number is, or should be, unique in the context of something (social security, insurance number, etc.).

> - each collection of information that has an attribute with this unique number can be collected and presented as belonging together,

It is not that simple: a collection of information can have more than one number, and the CEN standard does not provide meta-information in some cases. E.g. PatientExtendedInformation carries a SetII, which is a set of numbers (identifiers); those numbers in a list do not have meta-information, except for the OID, but that meta-information can only be resolved over a network service (which does not yet exist).

For example, you retrieve a PatientExtendedInformation object from an information system, and it carries a few numbers. To know which one is the social security number, you have to resolve the OID. That is possible when you have an Internet connection and the OID resolving service is up and running, which may not always be the case.

We are talking about a CEN standard, which has the intention to solve all information problems world-wide.

- Sometimes you don't have an Internet connection (e.g. firewall restrictions)
- Sometimes the OID is not known at the resolving service (an OID from another country, which has no OID resolving exchange with our country)
- Sometimes there is no OID (a less developed country)
- Sometimes the resolving service is unreachable (down, hacked, whatever)

But even when all of this is working well, consider the following situation. You are data mining online, and you retrieve not one PatientExtendedInformation but 1,000,000. Having to resolve all the OIDs will slow your data mining down very much, and unnecessarily. It could slow down so much that it is unacceptable for a customer, and the world has to live without that data mining application, or the customer will look for another standard to work with.

I once wrote an application that analysed firewall logs. The analysis was a matter of seconds: some nifty mathematical algorithms over the logging database. But then the customer also wanted to know which companies the IP addresses came from, so the application had to resolve the IP addresses. Happily, DNS is a very good system, implemented worldwide (although there are problems with NAT, which is sometimes implemented region-wide (China) because of the unfair sharing of available IP addresses).

The customer was not happy. First, the application slowed down: what had been done in a few seconds took an hour or more (a factor of 1000 or more). Second, the result was not satisfactory because of NAT and other resolving issues.

This problem can easily be solved when the II object is extended with a qualifier that tells us what kind of II you are looking at. E.g. you want to know with which insurance company a million patients are insured, and every patient carries 10 numbers. Without this qualifier you have to resolve 10 numbers for each patient to find the one of interest. That means 10,000,000 resolving actions, where 1,000,000 would do if there were a qualifier: 9,000,000 resolving actions too many.

> - with one unique identifier per (pseudo)identity all information belonging to this unique identifier can be collected and presented as belonging together
> - this type of use is identifying documents (or parts of it) as containing information about the same person with a specific identity.
> - it is NO PROOF of the real identity of the person. That is a different matter.
> - When we have to uniquely identify persons we need other things than numbers.
> - Unique numbers must not be trusted.
> - Unique numbers that identify persons generate problems: identity theft.
> - Only knowledge that is known by the person, or features his body possesses, will help to identify persons.

Thus, its only use is not to identify a person; that is only one purpose of an information system.

There is also another problem with OIDs: an identity may not have an OID. I guess this will happen a lot, certainly in the coming few years. In that case, there has to be an OID which indicates that there is none. This is necessary because the OID is a mandatory property in the II type. In that case, the need for a qualifier is even more urgent.

There may be other solutions than a qualifier to this problem, but the current situation in the standard is, in my opinion, not sufficient.

regards
Bert Verhees

> Gerard
>
> -- private --
> Gerard Freriks, arts
> Huigsloterdijk 378
> 2158 LR Buitenkaag
> The Netherlands
> +31 252 544896
> +31 654 792800
>
> On 20 Apr 2005, at 12:43, Bert Verhees wrote:
>
>> Dear Grahame,
>>
>> For example the CEN GPIC subjectofcare which has a property id The type is a Set of II The use is
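The resolving arithmetic in the message above (10 lookups per patient without a qualifier, 1 with it) can be checked with a small counting sketch. Everything here is an assumption for illustration: the `Identifier` class with a `qualifier` field, the `CountingResolver` that stands in for a remote OID service, the made-up OIDs under "1.2.3", and the scaled-down population of 1,000 patients. It is not the CEN II type or any real resolving API.

```python
from dataclasses import dataclass
from typing import Optional

INSURANCE = "insurance_number"


@dataclass(frozen=True)
class Identifier:
    oid: str                         # opaque root, as in II
    value: str
    qualifier: Optional[str] = None  # hypothetical extra property


class CountingResolver:
    """Stand-in for a remote OID service; counts every lookup."""

    def __init__(self, registry):
        self.registry = registry
        self.calls = 0

    def resolve(self, oid):
        self.calls += 1
        return self.registry[oid]


def insurer_without_qualifier(ids, resolver):
    # Worst case from the message: every identifier has to be resolved.
    resolved = [resolver.resolve(i.oid) for i in ids]
    for meta in resolved:
        if meta["kind"] == INSURANCE:
            return meta["issuer"]
    return None


def insurer_with_qualifier(ids, resolver):
    # The qualifier is checked locally; only the matching OID is resolved.
    for i in ids:
        if i.qualifier == INSURANCE:
            return resolver.resolve(i.oid)["issuer"]
    return None


# Made-up registry: nine unrelated identifier kinds plus one insurer.
registry = {f"1.2.3.{k}": {"kind": "other", "issuer": f"Agency {k}"} for k in range(9)}
registry["1.2.3.99"] = {"kind": INSURANCE, "issuer": "Insurer A"}


def make_patient():
    ids = [Identifier(f"1.2.3.{k}", f"v{k}", "other") for k in range(9)]
    ids.append(Identifier("1.2.3.99", "INS-1", INSURANCE))
    return ids


patients = [make_patient() for _ in range(1000)]

slow = CountingResolver(registry)
for p in patients:
    insurer_without_qualifier(p, slow)

fast = CountingResolver(registry)
for p in patients:
    insurer_with_qualifier(p, fast)

print(slow.calls, fast.calls)  # 10000 1000
```

The ratio is the same 10:1 as in the message: with a million patients the counts become 10,000,000 and 1,000,000.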
Fwd: Re: OIDs / II
-- Forwarded message --

Subject: Re: OIDs / II
Date: Wednesday 20 April 2005 10:33
From: Bert Verhees bert.verh...@rosa.nl
To: openehr-technical at openehr.org

Grahame Grieve wrote:

> Regarding OIDs and II, I think there's a misunderstanding about the point of the OID part of the II.
>
> The OID is computationally opaque - it's not intended to tell you anything about the identifier, only to make it unique. Any scheme based on reverse engineering the OID into some form of information about the II that contains it will founder. Even if we turned around and based it on DNS, it still founders. The point is that the context of the II tells you what kind of identifier it is.
>
> OK, I lie. I see that in its common usage, certainly in HL7 and apparently in CEN and openEHR - though I haven't checked - things have a SETII or equivalent, which requires that the II semantics themselves tell you more than simply the value of the identifier. So I see a mismatch between the intention of II and its use.

Dear Grahame,

For example, the CEN GPIC subjectofcare has a property id. Its type is a Set of II. The use is explained as: "An identifier or identifiers that may be used to uniquely identify the subject of care. Examples: social security number, health service number, hospital number, case notes number."

Please indicate where there is a mismatch between the intention and the use of II. CEN TC251 could learn from that. It would be a great benefit to the standard if this were sorted out. And if it is, then the need for an extra qualifier that tells which type of identifier is presented may disappear, depending on your solution.

Kind regards
Bert Verhees

> The problem with information about the identifier is that it's rather hard to come up with a formal strategy for providing metainformation about the II. Take for instance the HL7 V2 mess of codesets for describing the source identifier type. As an HL7 V2 implementor, I know that it's done by site-specific decisions, since the mess is too far out of control. And V2.6 will only make this worse. Rather like doing it by OID, and configuring it by hand. I wouldn't want to be writing a general nation-wide HL7 receiver that had to sort out identifiers.
>
> No wait, I did write such a spec for Australian usage. And we took considerable care to nail down the Identifier usage so that this situation didn't arise.
>
> From my perspective (V2 implementor and V3 data types editor), the only possible solution - though I very much doubt it's feasible - is for there to be an LDAP-based health OID repository to allow dynamic querying of the OID metadata. Sure, this would mean that you'd need a live internet connection to do the query, but if you really are writing a general case data mining tool with no ability to nail the identifiers down in your incoming data, then this is the least of your worries.
>
> Grahame

--
Kind regards
Bert Verhees
ROSA Software
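Grahame's remark about "doing it by OID, and configuring it by hand" can be pictured as a site-maintained table that maps each known OID to an identifier kind. The sketch below is only an illustration of that idea under stated assumptions: the table `SITE_OID_KINDS`, the functions `kind_of` and `pick`, the OIDs and the sample values are all made up, and nothing here is taken from HL7, CEN or a real OID registry.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical, hand-maintained site configuration: OID -> identifier kind.
# The OIDs are placeholders, not registered values.
SITE_OID_KINDS = {
    "1.2.3.1": "social_security_number",
    "1.2.3.2": "insurance_number",
    "1.2.3.3": "local_system_id",
}


def kind_of(oid: str) -> Optional[str]:
    """Return the locally configured kind for an OID, or None if unknown."""
    return SITE_OID_KINDS.get(oid)


def pick(identifiers: Iterable[Tuple[str, str]], wanted: str) -> Optional[str]:
    """Return the first value whose OID is configured as the wanted kind."""
    for oid, value in identifiers:
        if kind_of(oid) == wanted:
            return value
    return None


# A patient's identifier set as (oid, value) pairs; the last OID is not configured.
patient_ids = [("1.2.3.3", "GP-0042"), ("1.2.3.2", "INS-778"), ("9.9.9", "X-1")]

print(pick(patient_ids, "insurance_number"))  # INS-778
print(kind_of("9.9.9"))                       # None
```

The lookup works offline, but only for OIDs someone has entered by hand: an identifier from an unconfigured source (here "9.9.9") cannot be classified, which is the limitation both correspondents describe.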
OIDs / II
On Tuesday 15 March 2005 11:44, Grahame Grieve wrote:

> Regarding OIDs and II, I think there's a misunderstanding about the point of the OID part of the II.
>
> The OID is computationally opaque - it's not intended to tell you anything about the identifier, only to make it unique. Any scheme based on reverse engineering the OID into some form of information about the II that contains it will founder. Even if we turned around and based it on DNS, it still founders. The point is that the context of the II tells you what kind of identifier it is.

What if the context is patientExtendedInformation? A patient can carry many identifiers.

> OK, I lie. I see that in its common usage, certainly in HL7 and apparently in CEN and openEHR - though I haven't checked - things have a SETII or equivalent, which requires that the II semantics themselves tell you more than simply the value of the identifier. So I see a mismatch between the intention of II and its use.

I don't know what the intention was, but the use is to present one or more identifiers connected to an entity. That seems to me a good use.

> The problem with information about the identifier is that it's rather hard to come up with a formal strategy for providing metainformation about the II. Take for instance the HL7 V2 mess of codesets for describing the source identifier type. As an HL7 V2 implementor, I know that it's done by site-specific decisions, since the mess is too far out of control. And V2.6 will only make this worse. Rather like doing it by OID, and configuring it by hand. I wouldn't want to be writing a general nation-wide HL7 receiver that had to sort out identifiers.

I do not want to substitute metainformation for the OID, but to add metainformation to II as an extra property: a qualifier, like insurance_number, social_security_number, local_system_id, etc.

> No wait, I did write such a spec for Australian usage. And we took considerable care to nail down the Identifier usage so that this situation didn't arise.
>
> From my perspective (V2 implementor and V3 data types editor), the only possible solution - though I very much doubt it's feasible - is for there to be an LDAP-based health OID repository to allow dynamic querying of the OID metadata. Sure, this would mean that you'd need a live internet connection to do the query, but if you really are writing a general case data mining tool with no ability to nail the identifiers down in your incoming data, then this is the least of your worries.

Even the least of all worries is still a worry, and it depends. I am not writing a general data mining tool; I am writing a CEN layer to present data from local GP systems in a standardized form. This works fine, and third parties know where they can find their data, independent of the underlying GP system. The patient carries a few identifiers, in a Set, and I want to mark one of the identifiers as an insurance number. That is all.

The layer I am writing can be used for data mining, and will be used as such, but there are many more uses. I have no complaint about market interests.

> Grahame

--
Kind regards
Bert Verhees
ROSA Software