Re: [Tutor] converting EBCIDIC to ASCII
> I am trying to convert an EBCDIC file to ASCII. When the records are fixed length I can convert it fine, but I have some files that are coming in as variable-length records. Is there a way to convert the file in Python? I tried using no length, but then it just reads into a fixed buffer size and I can't seem to break the records properly.

Hi Craig,

You might find it easier to pass the records through iconv if you're on a Linux/Unix box, converting from IBM037 (or whatever codepage your EBCDIC files are in) to ISO8859. There are versions of this GNU software for Windows too, if that's your platform. It's trivial to use. Shout if you need a hand.

That said, you'll almost certainly find that the 4-byte RDW has been stripped from the file when it was sent to you, so you're not being given any information to determine the length of each variable-length record.

Quick way to check: open the EBCDIC file in a hex editor (I use HxD from http://mh-nexus.de/en/hxd/ as it will happily run in EBCDIC mode). If you can't see 4 bytes at the start of each record, then you're in trouble, as you have no way of determining the record length without the copybook for the file on the mainframe.

S.
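For anyone who would rather stay in Python than shell out to iconv, the same codepage conversion can be sketched with the built-in codecs. This is only a minimal sketch: it assumes the data is plain text in IBM037 (Python's `cp037` codec) with no packed or binary fields, and the function name `transcode` is made up for illustration.

```python
def transcode(data, source="cp037", target="iso-8859-1"):
    """Re-encode raw bytes from one codepage to another, roughly as iconv would."""
    # Decode the source bytes to text, then encode that text in the target codepage.
    return data.decode(source).encode(target)


# Hypothetical sample: build some EBCDIC bytes, then convert them back.
sample = "HELLO, CRAIG".encode("cp037")
print(sample)             # raw EBCDIC bytes
print(transcode(sample))  # the same text as ISO 8859-1 bytes
```

Note that this does nothing about record boundaries; it only changes the character encoding.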
___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] converting EBCIDIC to ASCII
-----Original Message-----
From: tutor-bounces+steve.flynn=capita.co...@python.org [mailto:tutor-bounces+steve.flynn=capita.co...@python.org] On Behalf Of Steven D'Aprano
Sent: Saturday, July 14, 2012 2:42 AM
To: tutor@python.org
Subject: Re: [Tutor] converting EBCIDIC to ASCII

> Prinn, Craig wrote:
> > I am trying to convert an EBCDIC file to ASCII. When the records are fixed length I can convert it fine, but I have some files that are coming in as variable-length records. Is there a way to convert the file in Python? I tried using no length, but then it just reads into a fixed buffer size and I can't seem to break the records properly.
>
> I'm afraid that I have no idea what you mean here. What are you actually doing? What does "tried using no length" mean?

The conversion from EBCDIC to ASCII is only going to get Craig so far. Depending on how the sender transferred the files to him, there's a very good chance that the 4-byte RDW (Record Descriptor Word) has been stripped off the start of each record. This 4-byte RDW should indicate that the next N bytes belong to this record. Without it, you have no way of determining how long the current record should be, and thus where the next RDW should be. This makes finding the start and end of records tricky, to say the least.

I've written to Craig off list with some info, as it's not particularly relevant to Python, other than letting Python do the work of iconv.

S.
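If the RDWs did survive the transfer, walking the file in Python is straightforward. The sketch below rests on assumptions the thread does not confirm for Craig's files: that each RDW is a 2-byte big-endian length which counts the 4 RDW bytes themselves, followed by 2 bytes that are normally zero, and that there are no block descriptor words in the file. The function name and the `cp037` default are hypothetical.

```python
import struct


def read_rdw_records(data, encoding="cp037"):
    """Split RDW-prefixed bytes into decoded records.

    Assumes each record starts with a 4-byte RDW: a big-endian 16-bit
    length (including the RDW itself) plus two further bytes, ignored here.
    """
    records = []
    pos = 0
    while pos + 4 <= len(data):
        length, _unused = struct.unpack(">HH", data[pos:pos + 4])
        if length < 4 or pos + length > len(data):
            raise ValueError("implausible RDW length %d at offset %d" % (length, pos))
        # The payload is everything after the 4-byte RDW.
        records.append(data[pos + 4:pos + length].decode(encoding))
        pos += length
    if pos != len(data):
        raise ValueError("trailing bytes after last record at offset %d" % pos)
    return records


def make_record(text, encoding="cp037"):
    """Build one RDW-prefixed record (used only to create demo data)."""
    payload = text.encode(encoding)
    return struct.pack(">HH", len(payload) + 4, 0) + payload


demo = make_record("FIRST") + make_record("SECOND RECORD")
print(read_rdw_records(demo))
```

If the lengths come out implausible on real data, that is a strong hint the RDWs were stripped or that the layout differs from what is assumed here.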
[Tutor] converting EBCIDIC to ASCII
I am trying to convert an EBCDIC file to ASCII. When the records are fixed length I can convert it fine, but I have some files that are coming in as variable-length records. Is there a way to convert the file in Python? I tried using no length, but then it just reads into a fixed buffer size and I can't seem to break the records properly.

Craig Prinn
Manager, Data Management
Phone: 919-767-6640
Cell: 410-320-9962
Address: Bell and Howell
3600 Clipper Mill Road Suite 404
Baltimore MD 21211
Re: [Tutor] converting EBCIDIC to ASCII
On Thu, Jul 5, 2012 at 9:30 AM, Prinn, Craig craig.pr...@bhemail.com wrote:

> I am trying to convert an EBCDIC file to ASCII. When the records are fixed length I can convert it fine, but I have some files that are coming in as variable-length records. Is there a way to convert the file in Python? I tried using no length, but then it just reads into a fixed buffer size and I can't seem to break the records properly.

I know of only three varieties of variable-length-record files:

- Delimited - i.e. there's some special character that ends the record, and (perhaps) a special character that separates fields. CSV is the classic example: newlines to separate records, commas to separate fields.
- Prefixed - there's a previously-agreed schema of record lengths, where (for example) a record that starts with A is 25 characters long; a B record is 136 characters long, etc.
- Sequential - record types/lengths appear in a previously-agreed order, such as 25 characters, 136 characters, 45 characters, etc.

For each of these types, the schema may be externally published, or it may be encoded in a special record at the beginning of the file. To use an example near and dear to my own experience, ANSI X12 EDI files all start with a fixed-length ISA record, which among other things contains the element separator, repetition separator, sub-element separator, and segment terminator characters in positions 3, 84, 104, and 105. To read an X12 file, therefore, you read it in, look at positions 3, 84, 104, and 105, and then use that information to break up the rest of the file into records and fields.

How you handle variable-length records depends on what kind they are, and how much you know about them going in. Python is just a tool for applying your specialized domain knowledge - by itself, it doesn't know any more about your particular solution than you do.

If you have more information about the structure of your files, and need help implementing an algorithm to deal with 'em, let us know!
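As an illustration of the "prefixed" variety described above, here is a minimal sketch. It assumes the text has already been decoded and that the first character of each record identifies its type. The lengths for A and B are the example figures from this message; a real schema would come from whoever publishes the file format, and the function name is made up.

```python
# Example schema only: record type character -> total record length.
RECORD_LENGTHS = {"A": 25, "B": 136}


def split_prefixed(text, lengths=RECORD_LENGTHS):
    """Split decoded text into records using the type character at the start of each."""
    records = []
    pos = 0
    while pos < len(text):
        record_type = text[pos]
        if record_type not in lengths:
            raise ValueError("unknown record type %r at offset %d" % (record_type, pos))
        length = lengths[record_type]
        record = text[pos:pos + length]
        if len(record) < length:
            raise ValueError("truncated %s record at offset %d" % (record_type, pos))
        records.append(record)
        pos += length
    return records


# Hypothetical data: an A record, a B record, then another A record.
demo = "A" + "x" * 24 + "B" + "y" * 135 + "A" + "z" * 24
print([len(r) for r in split_prefixed(demo)])
```

A sequential file would work the same way, except that the lengths would be taken from an agreed list in order rather than looked up by type.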
Re: [Tutor] converting EBCIDIC to ASCII
On Fri, Jul 13, 2012 at 1:28 PM, Prinn, Craig craig.pr...@bhemail.com wrote:

> The records are coming off of a mainframe, so there probably was a 2 byte RDW or length indicator at one point. If there is a 0x0D 0x0A at the end would that work?
>
> Thanks
> Craig

I presume so, but (despite my bloviating about the generalities of variable-length records) I don't actually know all that much about how systems that use EBCDIC tend to structure their files. (My big iron days were in an HP 3000 shop, which DID use EBCDIC, but that was 22 years ago, and at the time I was a database-only programmer and didn't need to worry my little head about actual file I/O.)

By "at the end" do you mean 'at the end of each record', or 'at the end of the file'? If you meant 'at the end of each record', then my approach would be:

- create an empty list called lines
- read in the file (or buffer-sized chunks of it, anyway) - call it inFile
- create recordBegin and recordEnd pointers, initialized to 0
- search for 0x0D 0x0A (or whatever) in the stream of bytes
- each time I find it,
    - set the recordEnd pointer
    - make a copy of the bytes between recordBegin and recordEnd and append it to lines
    - copy recordEnd to recordBegin (stepping past the delimiter)
- lather, rinse, repeat
- at the end, decode each bytestream in lines

If you meant 'at the end of the file', then I'm not sure it helps, and I don't know what you'd need to move forward.

Good luck!
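The delimiter-scanning steps in this message can be sketched roughly as follows. It assumes the whole file fits in memory, that the delimiter really is the two bytes 0x0D 0x0A after each record, and that those two bytes never occur inside a record (which packed or binary fields could violate). The `cp500` default and the function name are assumptions, not something the thread settles.

```python
def split_records(data, delimiter=b"\x0d\x0a", encoding="cp500"):
    """Split raw bytes on a delimiter, then decode each record."""
    lines = []
    record_begin = 0
    while True:
        record_end = data.find(delimiter, record_begin)
        if record_end == -1:
            break
        # Copy the bytes between the two pointers, excluding the delimiter.
        lines.append(data[record_begin:record_end])
        # Step past the delimiter so the next search starts on the next record.
        record_begin = record_end + len(delimiter)
    if record_begin < len(data):
        # Keep any final record that has no trailing delimiter.
        lines.append(data[record_begin:])
    # Decode only after splitting, so the delimiter bytes are matched as raw bytes.
    return [record.decode(encoding) for record in lines]


# Hypothetical sample: two EBCDIC records, each followed by 0x0D 0x0A.
demo = "HELLO".encode("cp500") + b"\x0d\x0a" + "WORLD".encode("cp500") + b"\x0d\x0a"
print(split_records(demo))
```

Splitting before decoding matters here: the delimiter is searched for as raw byte values, so it should not be passed through the EBCDIC codec first.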
Re: [Tutor] converting EBCIDIC to ASCII
Prinn, Craig wrote:
> I am trying to convert an EBCDIC file to ASCII. When the records are fixed length I can convert it fine, but I have some files that are coming in as variable-length records. Is there a way to convert the file in Python? I tried using no length, but then it just reads into a fixed buffer size and I can't seem to break the records properly.

I'm afraid that I have no idea what you mean here. What are you actually doing? What does "tried using no length" mean?

Converting from one encoding to another should have nothing to do with whether they are fixed-length records, variable-length records, or free-form text. First you read the file as bytes, then use the encoding to convert to text, then process the file however you like.

Using Python 3, I prepared an EBCDIC file. If I open it in binary mode, I get the raw bytes, which are a mess:

    py> raw = open('/home/steve/ebcidic.text', 'rb').read()
    py> print(raw)
    b'\xe3\x88\x89\xa2@\x89\xa2@\\\xa2\x96\x94\x85\\@\xe3 ...

For brevity, I truncated the output. But if you open in text mode, and set the encoding correctly, Python automatically converts the bytes into text according to the rules of EBCDIC:

    py> text = open('/home/steve/ebcidic.text', 'r', encoding='cp500').read()
    py> print(text)
    This is *some* Text containing punctuation other things(!) which may{?} NOT be the +++same+++ when encoded into ASCII|EBCIDIC.

This is especially useful if you need to process the file line by line. Simply open the file with the right encoding, then loop over the file as normal.

    f = open('/home/steve/ebcidic.text', 'r', encoding='cp500')
    for line in f:
        print(line)

In this case, I used IBM's standard EBCDIC encoding for Western Europe. Python knows about some others; see the documentation for the codecs module for the list.

http://docs.python.org/library/codecs.html
http://docs.python.org/py3k/library/codecs.html

Once you have the text, you can then treat it as fixed width, variable width, or whatever else you might have.

Python 2 is a little trickier. You can manually decode the bytes:

    # not tested
    text = open('/home/steve/ebcidic.text', 'rb').read().decode('cp500')

or you can use the codecs module to get very close to the same functionality as Python 3:

    # also untested
    import codecs
    f = codecs.open('/home/steve/ebcidic.text', 'r', encoding='cp500')
    for line in f:
        print line

--
Steven
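To illustrate the "treat it as fixed width" step for a file that has no line breaks at all, here is a small Python 3 sketch. The record length is a made-up parameter that would have to come from the file's layout, and `cp500` is simply the codec used in the examples above; neither is known for Craig's actual files.

```python
def fixed_width_records(data, record_length, encoding="cp500"):
    """Decode raw bytes and cut the text into equal-sized records."""
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    if len(data) % record_length:
        raise ValueError("data length is not a multiple of the record length")
    # cp500 is a single-byte encoding, so one byte becomes exactly one character.
    text = data.decode(encoding)
    return [text[i:i + record_length] for i in range(0, len(text), record_length)]


# Hypothetical sample: three 8-character records with no separators between them.
demo = "RECORD01RECORD02RECORD03".encode("cp500")
for record in fixed_width_records(demo, 8):
    print(record)
```

Looping over the file line by line only helps when the data contains something the decoder recognises as a line ending, which is why the slicing is done by position here.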
Re: [Tutor] Converting ebcidic to ascii
Thanks Mark, that did the trick. I couldn't quite figure out the syntax before.

Craig Prinn
Document Solutions Manager
Office Phone 919-767-6640
Cell Phone 410-320-9962
Fax 410-243-0973
3600 Clipper Mill Road Suite 404
Baltimore, MD 21211

-----Original Message-----
From: tutor-bounces+craig.prinn=bowebellhowell@python.org [mailto:tutor-bounces+craig.prinn=bowebellhowell@python.org] On Behalf Of tutor-requ...@python.org
Sent: Wednesday, June 15, 2011 6:00 AM
To: tutor@python.org
Subject: Tutor Digest, Vol 88, Issue 54

Today's Topics:

1. Re: Already Initialized Object Inheritance? (WolfRage)
2. Re: trying to translate and ebcidic file (Mark Tolonen)

Message: 1
Date: Tue, 14 Jun 2011 23:42:59 -0700
From: WolfRage wolfrage8...@gmail.com
To: Japhy Bartlett ja...@pearachute.com
Cc: Python Tutor tutor@python.org
Subject: Re: [Tutor] Already Initialized Object Inheritance?
Message-ID: 1308120179.1952.50.camel@wolfrage-LE1600
Content-Type: text/plain; charset=UTF-8

Unfortunately I am not able to inherit stdscr using that method, as Python returns an error stating that stdscr is not defined. This error is returned at run time and by the compiler prior to actual execution. If you would like, I can write a quick example that will generate the error message for that method.
--
Jordan

On Wed, 2011-06-15 at 02:04 -0400, Japhy Bartlett wrote:
> When you're subclassing something, you use the syntax:
>
>     class Foo(Bar):
>
> It seems like you're trying to do:
>
>     class Bar:
>         class Foo:
>
> - Japhy
>
> On Wed, Jun 15, 2011 at 12:47 AM, WolfRage wolfrage8...@gmail.com wrote:
> > I cannot get this to behave in the manner that I would like. I am trying to have an object referred to as CursesApp.Screen become the already initialized object stdscr. To elaborate, I would like it to become that object but also be able to define additional methods and properties, so more along the lines of inheriting from stdscr. Is this even possible? While I can make it equal to that object, I cannot add additional methods and properties to it. Additionally, so that I learn: where has my thinking been too short-sighted? Thank you for your help.
> > --
> > Jordan
> >
> > CODE BELOW
> >
> > #!/usr/bin/python3
> > """With this method I can make the class Screen become stdscr, but if I
> > reference any of the new methods or properties the application promptly
> > fails and notifies me that the method or property does not exist. Another
> > downside of this method is I cannot reference self.Screen.* or it crashes."""
> > import curses
> >
> > class CursesApp:
> >     def __init__(self, stdscr):
> >         self.Screen(stdscr) #This is the stdscr object.
> >         curses.init_pair(1,curses.COLOR_BLUE,curses.COLOR_YELLOW)
> >         #self.Screen.bkgd(' ', curses.color_pair(1))
> >         #self.mainLoop()
> >
> >     #def mainLoop(self):
> >         #while 1:
> >             #self.Screen.refresh()
> >             #key=self.Screen.getch()
> >             #if key==ord('q'): break
> >
> >     class Screen:
> >         def __init__(self,stdscr):
> >             self=stdscr
> >             #self.height, self.width = self.getmaxyx() # any reference to these crashes
> >             #self.offsety, self.offsetx = -self.height/2, -self.width/2 # any reference to these crashes
> >             #self.curx, self.cury = 1, 1 # any reference to these crashes
> >             self.clear()
> >             self.border(0)
> >             while 1:
> >                 self.refresh()
> >                 key=self.getch()
> >                 if key==ord('q'): break
> >
> > def main():
> >     cursesapp = curses.wrapper(setup)
> >
> > def setup(stdscr):
> >     CursesApp(stdscr)
> >
> > if __name__ == '__main__':
> >     main()
> >
> > CODE BELOW
> >
> > #!/usr/bin/python3
> > """With this method I can make Screen become stdscr, but I obviously
> > cannot even define any new methods or properties. But at least the
> > references can be used throughout the class without crashing."""
> > import curses
> >
> > class CursesApp:
> >     def __init__(self, stdscr):
> >         self.Screen=stdscr #This is the stdscr object.
> >         curses.init_pair(1,curses.COLOR_BLUE,curses.COLOR_YELLOW)
> >         self.Screen.bkgd(' ', curses.color_pair(1))
> >         self.mainLoop()
> >
> >     def mainLoop(self):
> >         while 1:
> >             self.Screen.refresh()
> >             key=self.Screen.getch()
> >             if key==ord('q'): break
> >
> > def main():
> >     cursesapp = curses.wrapper(setup)
> >
> > def setup(stdscr):
> >     CursesApp(stdscr)
> >
> > if __name__ ==