Re: [Jprogramming] Parsing EDI data and converting them into a database format

George Dallas Fri, 13 Nov 2015 13:56:18 -0800

Thanks Raul! I would have totally missed that and I think this cut
must also be what Joe used in his chop function.


Regards,

George

*--------------------------------------------------------------------------------------------------------------------------------------------------------*

Raul Miller rauldmiller at gmail.com
*Fri Nov 13 20:30:03 UTC 2015*

Please keep in mind that there's cut (the library verb) and there's ;. (the
cut conjunction), which is documented here:

http://www.jsoftware.com/help/dictionary/d331.htm

Here's the cut which I think Chris was referring to:

   cut
' '&$: :([: -.&a: <;._2@,~)

That verb actually uses http://www.jsoftware.com/help/dictionary/d331.htm
but it's predefined to break on a specific character (which defaults to ' '
but can be specified as the left argument).

   cut 'this is a test'
+----+--+-+----+
|this|is|a|test|
+----+--+-+----+
   't' cut 'this is a test'
+---------+--+
|his is a |es|
+---------+--+
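
To make the link between cut and ;._2 concrete, here is a small sketch. It is
an illustration only, not taken from Joe's gist, and it assumes a standard J
session in which the library definition of cut shown above is loaded. The
names byhand, seg and elems are made up for the example. It first does by hand
what cut does, then applies cut at two levels to the first three segments of
the 997 message quoted below.

```j
NB. ;._2 treats the LAST item of its argument as the fret (delimiter),
NB. so cut appends the delimiter first and then removes any empty boxes.
byhand =: a: -.~ <;._2 'this is a test' , ' '
NB. byhand matches the result of:  cut 'this is a test'

NB. Two-level split of an EDI fragment (hypothetical names seg and elems):
NB. '~' ends a segment, '*' separates the elements inside a segment.
seg   =: '~' cut 'ST*997*2878~AK1*HS*293328532~AK2*270*307272179~'
elems =: '*'&cut each seg

NB. seg holds 3 boxed segments; the first entry of elems opens to the
NB. boxed list 'ST' ; '997' ; '2878'
```

This only tokenizes. As the thread notes, deciding what each element means
still depends on the specification for the particular message type.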

I hope this helps,

-- 
Raul


On Fri, Nov 13, 2015 at 3:49 PM, George Dallas wrote:

> Joe, this is amazing!! What an incredibly powerful one-line function!! This 
> is not just a step in the right direction; you're actually putting me in 
> a cannon and shooting me in the right direction :-)
>
> Of course there is a lot to study here for me, but enabling me to study in 
> the context of the problem is extremely helpful.
>
> Thank you very much!
>
> George
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Joe Bogner joebogner at gmail.com  
> *Fri Nov 13 20:07:31 UTC 2015*
>
> George, here's some ideas to get you started:
>
> Forgive the terrible display in email. See this: 
> https://gist.github.com/joebo/42c914ba332c9e5d628c
>
>
> msg=: 0 : 0
> ST*997*2878~AK1*HS*293328532~AK2*270*307272179~
> AK3*NM1*8*L1010_0*8~AK4*0:0*66*1~AK4*0:1*66*1~AK4*0:2*
> 66*1~AK3*NM1*8*L1010_1*8~AK4*1:0*66*1~AK4*1:1*66*1~AK3*
> NM1*8*L1010_2*8~AK4*2:0*66*1~AK5*R*5~AK9*R*1*1*0~SE*8*2878~
> )
>
> NB. from https://www.ameren.com/-/media/corporate-site/Files/BusinessPartners/CPWG/CPWGIL814E-Request.pdf
> msg2 =: 0 : 0
> ST*814*0001
> BGN*13*2010063000001*20100630
> N1*8S*UTILITY*1*006912345
> N1*SJ*SUPPLIER*9*007909111IL00
> N1*8R*CUSTOMER NAME
> LIN*1*SH*EL*SH*CE
>  ASI*7*021
>  REF*11*0012345600
>  REF*12*0312345624
>  REF*BLT*LDC
>  REF*PC*DUAL
>  REF*9V*Y
> SE*13*0001
> )
>
> chop=: >@: ((('*' cut ]&dlb) each each) @: ('~' cut each ]) @: (LF cut ]))
>
>
> On Fri, Nov 13, 2015 at 1:31 PM, George Dallas wrote:
>
>> Hi Chris, thank you for the reply. I'll start studying J's cut. It looks
>> like it'll require some hard studying from what I see in the dictionary
>> entry for cut (pasted below).
>>
>> Regards,
>> George
>>
>> *Cut* m;.n  u;.n  _ 1/2 _
>>
>> x u;.0 y applies u to a rectangle or cuboid of y with one vertex at the
>> point in y indexed by v=:0{x , and with the opposite vertex determined
>> as follows: the dimension is |1{x , but the rectangle extends *back* from
>> v along any axis j for which the index j{v is negative. Finally, the
>> order of the selected items is reversed along each axis k for which
>> k{1{x is negative. If x is a vector, it is treated as the matrix 0,:x .
>>
>>
>>
>> ----------------------------------------------------------------------------------------------------------------------------------
>> chris burke cburke at jsoftware.com
>> *Fri Nov 13 18:53:56 UTC 2015*
>>
>> I did this some years ago and found that J can parse any given EDI format
>> very efficiently, using cut to chop up the strings. You might need
>> different functions for specific EDI formats, rather than a single function
>> to parse arbitrary EDI.
>>
>>
>> On Fri, Nov 13, 2015 at 12:36 PM, George Dallas wrote:
>>
>>> Hi Joe, thank you for your reply. I am indeed thinking about a subset of 
>>> X12 messages and specifically 20 types of utility exchanges with power 
>>> suppliers, found on the link here: 
>>> https://www.ameren.com/business-partners/cpwg/illinois-edi-implementation-guide.
>>>
>>> The x12parser you mentioned is a good and extensive project and with a 
>>> little work it might provide for what I need, but it's the verbosity of C# 
>>> used there that drives me towards thinking of a cleaner version that 
>>> possibly could be implemented in J.
>>>
>>> I'm wondering if given any specification, say the 997 you mentioned below, 
>>> the essence of the problem of converting an edi message to a flat file in 
>>> normalized form can be expressed concisely in J. If that were the case, I 
>>> suspect it would scale better and be a much faster implementation.
>>>
>>> If I were to go down this route are there any J facilities you'd recommend 
>>> for parsing and transforming text files?
>>>
>>> Thank you,
>>>
>>> George
>>>
>>>
>>> ------------------------------------------------------------------------------------------------------
>>>
>>> On Fri, Nov 13, 2015 at 11:10 AM, George Dallas <george.dallas at gmail.com> wrote:
>>> > Hello,
>>> >
>>> > Has anyone had the chance to work with EDI data using J?
>>>
>>> Hi George, I have not, but I spent a few minutes looking into it.
>>>
>>> > Of course there is a huge industry out there spun to deal with this
>>> > problem, but I was wondering if anyone has had to tackle the issue
>>> > using J and if you think it's a doable project for J.
>>>
>>> I think we would need a bit more information about what you see for
>>> the project. Are you interested in building a library in J capable of
>>> parsing and interpreting all the various types of X12 messages or do
>>> you just need to work with a subset?
>>>
>>> If you were working with a small subset then I would consider
>>> implementing just what is necessary to parse those messages. If it's
>>> many messages, then I would lean towards integrating with something
>>> that has already solved the problem. The spec sounds reasonably
>>> complex and to make use of the information, the definitions are
>>> required.
>>>
>>> Here's one possible implementation to work with: 
>>> https://x12parser.codeplex.com/
>>>
>>> Here's the 997 specification out of the nearly 1000 options
>>> https://x12parser.codeplex.com/SourceControl/latest#trunk/src/OopFactory.X12/Specifications/Ansi-997-4010Specification.xml
>>>
>>>
>>> On Fri, Nov 13, 2015 at 10:10 AM, George Dallas wrote:
>>>
>>>> Hello,
>>>>
>>>> Has anyone had the chance to work with EDI data using J?
>>>>
>>>> EDI messages are text files formatted to facilitate business-to-business
>>>> communications. If one has a sufficiently large history of these
>>>> files and manages to insert them into a database, then querying the database
>>>> would give answers to many business questions regarding customers, costs
>>>> etc.
>>>>
>>>> I found the link and text pasted below to be a concise description
>>>> of the problem.
>>>>
>>>> Of course there is a huge industry out there spun to deal with this
>>>> problem, but I was wondering if anyone has had to tackle the issue using J
>>>> and if you think it's a doable project for J.
>>>>
>>>> Regards,
>>>> George
>>>>
>>>>
>>>>
>>>> https://github.com/pstuteville/x12
>>>>
>>>> == The problem
>>>>
>>>> X12 is a set of "standards" possessing all the elegance of an elephant
>>>> designed by committee, and quite literally so, see http://www.x12.org.
>>>> X12 defines rough syntax for specifying text messages, but each of
>>>> more than 300 specifications defines its own message structure. While
>>>> messages themselves are easy to parse with a simple tokenizer, their
>>>> semantics is heavily dependent on the domain. For example, this is
>>>> X12/997 message conveying "Functional Acknowledgment":
>>>>
>>>>   ST*997*2878~AK1*HS*293328532~AK2*270*307272179~AK3*NM1*8*L1010_0*8~
>>>>   AK4*0:0*66*1~AK4*0:1*66*1~AK4*0:2*66*1~AK3*NM1*8*L1010_1*8~AK4*1:0*
>>>>   66*1~AK4*1:1*66*1~AK3*NM1*8*L1010_2*8~AK4*2:0*66*1~AK5*R*5~AK9*R*1*
>>>>   1*0~SE*8*2878~
>>>>
>>>> I.e., X12 defines an alphabet and somewhat of a dictionary - not a
>>>> grammar or semantics for each particular data interchange
>>>> conversation. Because of many entrenched implementations and
>>>> government mandates, the X12 is not going to die anytime soon,
>>>> unfortunately.
>>>>
>>>> The message above can be easily represented in Ruby as a nested array:
>>>>
>>>>  m = [
>>>>       ['ST', '997', '2878'],
>>>>       ['AK1', 'HS', '293328532'],
>>>>       ['AK2', '270', '307272179'],
>>>>       ['AK3', 'NM1', '8', 'L1010_0', '8'],
>>>>       ['AK4', '0:0', '66', '1'],
>>>>       ['AK4', '0:1', '66', '1'],
>>>>       ['AK4', '0:2', '66', '1'],
>>>>       ['AK3', 'NM1', '8', 'L1010_1', '8'],
>>>>       ['AK4', '1:0', '66', '1'],
>>>>       ['AK4', '1:1', '66', '1'],
>>>>       ['AK3', 'NM1', '8', 'L1010_2', '8'],
>>>>       ['AK4', '2:0', '66', '1'],
>>>>       ['AK5', 'R', '5'],
>>>>       ['AK9', 'R', '1', '1', '0'],
>>>>       ['SE', '8', '2878'],
>>>>      ]
>>>>
>>>> but it will not help any since, say, segment 'AK4' is ambiguously
>>>> defined and its meaning not at all obvious until the message's
>>>> structure is interpreted and correct 'AK4' segment is found.
>>>>
>>>> == The solution
>>>>
>>>> === Message structure
>>>>
>>>> Each participant in EDI has to know the structure of the data coming
>>>> across the wire - X12 or no X12. The X12 structures are defined in
>>>> so-called Implementation Guides - thick books with all the data pieces
>>>> spelled out. There is no other choice, but to invent a
>>>> computer-readable definition language that will codify these
>>>> books. For familiarity's sake we'll use XML. For example, the X12/997
>>>> message can be defined as
>>>>
>>>>   <Definition>
>>>>     <Loop name="997">
>>>>       <Segment name="ST" min="1" max="1"/>
>>>>       <Segment name="AK1" min="1" max="1"/>
>>>>       <Loop name="L1000" max="999999" required="y">
>>>>         <Segment name="AK2" max="1" required="n"/>
>>>>         <Loop name="L1010" max="999999" required="n">
>>>>           <Segment name="AK3" max="1" required="n"/>
>>>>           <Segment name="AK4" max="99" required="n"/>
>>>>         </Loop>
>>>>         <Segment name="AK5" max="1" required="y"/>
>>>>       </Loop>
>>>>       <Segment name="AK9" max="1" required="y"/>
>>>>       <Segment name="SE"  max="1" required="y"/>
>>>>     </Loop>
>>>>   </Definition>
>>>>
>>>> Namely, the 997 is a 'loop' containing segments ST (only one), AK1
>>>> (also only one), another loop L1000 (zero or many repeats), segments
>>>> AK9 and SE. The loop L1000 can contain a segment AK2 (optional) and
>>>> another loop L1010 (zero or many), and so on.
>>>>
>>>> The segments' structure can be further defined as, for example,
>>>>
>>>>   <Segment name="AK2">
>>>>     <Field name="TransactionSetIdentifierCode" required="y" min="3" max="3" validation="T143"/>
>>>>     <Field name="TransactionSetControlNumber"  required="y" min="4" max="9"/>
>>>>   </Segment>
>>>>
>>>> which defines a segment AK2 as having two fields:
>>>> TransactionSetIdentifierCode and TransactionSetControlNumber. The
>>>> field TransactionSetIdentifierCode is defined as having a type of
>>>> string (default), being required, having length of minimum 3 and
>>>> maximum 3 characters, and being validated against a table T143. The
>>>> validation table is defined as
>>>>
>>>>   <Table name="T143">
>>>>     <Entry name="100" value="Insurance Plan Description"/>
>>>>     <Entry name="101" value="Name and Address Lists"/>
>>>>     ...
>>>>     <Entry name="997" value="Functional Acknowledgment"/>
>>>>     <Entry name="998" value="Set Cancellation"/>
>>>>   </Table>
>>>>
>>>> with entries having just names and values.
>>>>
>>>> This message is fully fleshed out in an example 'misc/997.xml' file,
>>>> copied from the ASC X12N 276/277 (004010X093) "Health Care
>>>> Claim Status Request and Response" National Electronic Data
>>>> Interchange Transaction Set Implementation Guide.
>>>>
>>>> Now expressions like
>>>>
>>>>   message.L1000.L1010[1].AK4.DataElementReferenceNumber
>>>>
>>>> start making sense of sorts, overall X12's idiocy notwithstanding - it's
>>>> a field called 'DataElementReferenceNumber' of a first of possibly
>>>> many segments 'AK4' found in the second repeat of the loop 'L1010'
>>>> inside the enclosing loop 'L1000'. The meaning of the value '66' found
>>>> in this field is still in the eye of the beholder, but, at least its
>>>> location is clearly identified in the message.
>>>>
>>>>
>>>>
>>>
>>
>
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
