Re: [Jprogramming] Parsing EDI data and converting them into a database format

chris burke Fri, 13 Nov 2015 10:55:05 -0800

I did this some years ago and found that J can parse any given EDI format
very efficiently, using cut to chop up the strings. You might need
different functions for specific EDI formats, rather than a single function
to parse arbitrary EDI.


On 13 November 2015 at 08:10, George Dallas <[email protected]> wrote:

> Hello,
>
> Has anyone had the chance to work with EDI data using J?
>
> EDI messages are text files formatted to facilitate business-to-business
> communications. If one has a sufficiently large history of these files and
> manages to insert them into a database, then querying the database would
> give answers to many business questions regarding customers, costs, etc.
>
> I found the link and text pasted below to be a concise description of
> the problem.
>
> Of course there is a huge industry out there spun up to deal with this
> problem, but I was wondering if anyone has had to tackle the issue using J
> and whether you think it's a doable project for J.
>
> Regards,
> George
>
>
>
> https://github.com/pstuteville/x12
>
> == The problem
>
> X12 is a set of "standards" possessing all the elegance of an elephant
> designed by committee, and quite literally so, see http://www.x12.org.
> X12 defines rough syntax for specifying text messages, but each of
> more than 300 specifications defines its own message structure. While
> messages themselves are easy to parse with a simple tokenizer, their
> semantics is heavily dependent on the domain. For example, this is
> X12/997 message conveying "Functional Acknowledgment":
>
>   ST*997*2878~AK1*HS*293328532~AK2*270*307272179~AK3*NM1*8*L1010_0*8~
>   AK4*0:0*66*1~AK4*0:1*66*1~AK4*0:2*66*1~AK3*NM1*8*L1010_1*8~AK4*1:0*
>   66*1~AK4*1:1*66*1~AK3*NM1*8*L1010_2*8~AK4*2:0*66*1~AK5*R*5~AK9*R*1*
>   1*0~SE*8*2878~
>
> I.e., X12 defines an alphabet and somewhat of a dictionary - not a
> grammar or semantics for each particular data interchange
> conversation. Because of many entrenched implementations and
> government mandates, the X12 is not going to die anytime soon,
> unfortunately.
>
> The message above can be easily represented in Ruby as a nested array:
>
>  m = [
>       ['ST', '997', '2878'],
>       ['AK1', 'HS', '293328532'],
>       ['AK2', '270', '307272179'],
>       ['AK3', 'NM1', '8', 'L1010_0', '8'],
>       ['AK4', '0:0', '66', '1'],
>       ['AK4', '0:1', '66', '1'],
>       ['AK4', '0:2', '66', '1'],
>       ['AK3', 'NM1', '8', 'L1010_1', '8'],
>       ['AK4', '1:0', '66', '1'],
>       ['AK4', '1:1', '66', '1'],
>       ['AK3', 'NM1', '8', 'L1010_2', '8'],
>       ['AK4', '2:0', '66', '1'],
>       ['AK5', 'R', '5'],
>       ['AK9', 'R', '1', '1', '0'],
>       ['SE', '8', '2878'],
>      ]
>
> but it will not help much since, say, segment 'AK4' is ambiguously
> defined and its meaning is not at all obvious until the message's
> structure is interpreted and the correct 'AK4' segment is found.
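
A minimal Ruby sketch of the "simple tokenizer" step described above, in the
same language as the nested array. It assumes '~' terminates a segment and '*'
separates elements, as in the sample message; real interchanges declare their
delimiters in the ISA header, which this sketch does not read. The name
`tokenize` is hypothetical and not part of the linked library.

```ruby
# Sample X12/997 message from the text above, joined onto one line.
MESSAGE = 'ST*997*2878~AK1*HS*293328532~AK2*270*307272179~' \
          'AK3*NM1*8*L1010_0*8~AK4*0:0*66*1~AK4*0:1*66*1~AK4*0:2*66*1~' \
          'AK3*NM1*8*L1010_1*8~AK4*1:0*66*1~AK4*1:1*66*1~' \
          'AK3*NM1*8*L1010_2*8~AK4*2:0*66*1~AK5*R*5~AK9*R*1*1*0~SE*8*2878~'

# Hypothetical helper: split a message into segments, then each segment
# into its elements. The delimiters are assumptions taken from the sample.
def tokenize(message, segment_terminator: '~', element_separator: '*')
  message.split(segment_terminator).map do |segment|
    # A limit of -1 keeps trailing empty elements, which are legal in X12.
    segment.split(element_separator, -1)
  end
end

SEGMENTS = tokenize(MESSAGE)
SEGMENTS.each { |segment| p segment }
```

This reproduces the nested array shown above, and no more than that: it says
nothing about which loop a given 'AK4' belongs to.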
>
> == The solution
>
> === Message structure
>
> Each participant in EDI has to know the structure of the data coming
> across the wire - X12 or no X12. The X12 structures are defined in
> so-called Implementation Guides - thick books with all the data pieces
> spelled out. There is no other choice but to invent a
> computer-readable definition language that will codify these
> books. For familiarity's sake we'll use XML. For example, the X12/997
> message can be defined as
>
>   <Definition>
>     <Loop name="997">
>       <Segment name="ST" min="1" max="1"/>
>       <Segment name="AK1" min="1" max="1"/>
>       <Loop name="L1000" max="999999" required="y">
>         <Segment name="AK2" max="1" required="n"/>
>         <Loop name="L1010" max="999999" required="n">
>           <Segment name="AK3" max="1" required="n"/>
>           <Segment name="AK4" max="99" required="n"/>
>         </Loop>
>         <Segment name="AK5" max="1" required="y"/>
>       </Loop>
>       <Segment name="AK9" max="1" required="y"/>
>       <Segment name="SE"  max="1" required="y"/>
>     </Loop>
>   </Definition>
>
> Namely, the 997 is a 'loop' containing segments ST (only one), AK1
> (also only one), another loop L1000 (zero or many repeats), segments
> AK9 and SE. The loop L1000 can contain a segment AK2 (optional) and
> another loop L1010 (zero or many), and so on.
>
> The segments' structure can be further defined as, for example,
>
>   <Segment name="AK2">
>     <Field name="TransactionSetIdentifierCode" required="y" min="3" max="3" validation="T143"/>
>     <Field name="TransactionSetControlNumber"  required="y" min="4" max="9"/>
>   </Segment>
>
> which defines a segment AK2 as having two fields:
> TransactionSetIdentifierCode and TransactionSetControlNumber. The
> field TransactionSetIdentifierCode is defined as having a type of
> string (default), being required, having length of minimum 3 and
> maximum 3 characters, and being validated against a table T143. The
> validation table is defined as
>
>   <Table name="T143">
>     <Entry name="100" value="Insurance Plan Description"/>
>     <Entry name="101" value="Name and Address Lists"/>
>     ...
>     <Entry name="997" value="Functional Acknowledgment"/>
>     <Entry name="998" value="Set Cancellation"/>
>   </Table>
>
> with entries having just names and values.
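
A rough Ruby sketch of how such a field definition might be checked against a
value. The hash layout and the name `field_errors` are hypothetical and are
not the linked library's API. The table holds only the four entries quoted
above; the elided ones are deliberately left out.

```ruby
# Partial copy of table T143: only the entries shown in the text.
T143 = {
  '100' => 'Insurance Plan Description',
  '101' => 'Name and Address Lists',
  '997' => 'Functional Acknowledgment',
  '998' => 'Set Cancellation'
}.freeze

# Hypothetical in-memory form of the <Field> definition for
# TransactionSetIdentifierCode (required, length 3..3, validated by T143).
FIELD = {
  name: 'TransactionSetIdentifierCode',
  required: true,
  min: 3,
  max: 3,
  table: T143
}.freeze

# Returns a list of problems found in value; an empty list means it passed.
def field_errors(value, field)
  if value.nil? || value.empty?
    return field[:required] ? ['missing required value'] : []
  end

  errors = []
  errors << 'wrong length' unless value.length.between?(field[:min], field[:max])
  errors << 'not in validation table' if field[:table] && !field[:table].key?(value)
  errors
end

p field_errors('997', FIELD)
p field_errors('99', FIELD)
```

Because the table here is incomplete, a valid code such as '270' would be
reported as missing from it; a real check needs the full table from the
Implementation Guide.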
>
> This message is fully fleshed out in an example 'misc/997.xml' file,
> copied from the ASC X12N 276/277 (004010X093) "Health Care
> Claim Status Request and Response" National Electronic Data
> Interchange Transaction Set Implementation Guide.
>
> Now expressions like
>
>   message.L1000.L1010[1].AK4.DataElementReferenceNumber
>
> start making sense of sorts, overall X12's idiocy notwithstanding: it's
> a field called 'DataElementReferenceNumber' of the first of possibly
> many segments 'AK4' found in the second repeat of the loop 'L1010'
> inside the enclosing loop 'L1000'. The meaning of the value '66' found
> in this field is still in the eye of the beholder, but at least its
> location is clearly identified in the message.
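
To make the path concrete, here is a small Ruby sketch that resolves it by
hand on the sample message. It is not how the linked library does it. It
assumes every 'AK3' opens a new repeat of loop L1010 (per the definition
above), and that DataElementReferenceNumber is element 2 of 'AK4', inferred
from the statement that the value '66' is found in that field. All names are
hypothetical.

```ruby
# Segments of the sample message that fall between AK2 and AK5,
# i.e. the L1010 repeats inside the single L1000 loop.
L1000_BODY = [
  %w[AK3 NM1 8 L1010_0 8],
  %w[AK4 0:0 66 1],
  %w[AK4 0:1 66 1],
  %w[AK4 0:2 66 1],
  %w[AK3 NM1 8 L1010_1 8],
  %w[AK4 1:0 66 1],
  %w[AK4 1:1 66 1],
  %w[AK3 NM1 8 L1010_2 8],
  %w[AK4 2:0 66 1]
].freeze

# Assumption: each AK3 segment starts a new repeat of loop L1010.
L1010 = L1000_BODY.slice_before { |segment| segment.first == 'AK3' }.to_a

# Assumption: position of DataElementReferenceNumber within an AK4 segment
# (index 0 is the segment name itself).
DATA_ELEMENT_REFERENCE_NUMBER = 2

# Rough equivalent of message.L1000.L1010[i].AK4.DataElementReferenceNumber:
# take the first AK4 in the given L1010 repeat and read one element from it.
def first_ak4_reference_number(loops, repeat_index)
  ak4 = loops.fetch(repeat_index).find { |segment| segment.first == 'AK4' }
  ak4 && ak4[DATA_ELEMENT_REFERENCE_NUMBER]
end

p first_ak4_reference_number(L1010, 1)
```

Index 1 selects the second repeat of L1010, whose first 'AK4' segment is
AK4*1:0*66*1.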
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>
