Khalid Moulfi wrote:
>
> thanks for your quick answer.
> Here is a sample of the first line of the NIST file :
>
> 1.001:00000002451.002:30001.003:1192041424344454647484941041141241341415151516151715181.004:NPS1.005:200810291.006:41.007:51/Live
> Scan1.008:51/Live
> Scan1.009:0844251404U1.011:19.68501.012:19.68502.001:00000001882.002:02.003:30002.010:10050001902.019:200810292.029:02.054:Civilian2.083:01NA02NA03NA04NA05NA06NA07NA08NA09NA10NA2.233:ÈÏæä2.235:1011973400606
>
> but as the end of the line is not displayed, I send you a copy of the
> file with all the line.

That's because your file contains null bytes ('\x00').  The string you
display above shows everything up to the first null.

> The thing is even if I take the number of character from let's say
> 2.001 to the end of the line I do not get the real number of charatcer.

What do you mean by that?  Where did the numbers come from?  The file
contains one line of 471 bytes, including the newline.  Does that agree
with either of your sources?

> My goal is to modify this first line by adding new tag (with special
> character), suppress some of them, get the real number of length and
> after all this update to modify it in the original nst file.
>
> I'll try as you said to open it with rb parameters and see.

You will have to show me your code, along with what numbers you expect.
The file you sent is 471 bytes long, and that's exactly what I read, in
both text and binary modes:

    C:\tmp>python
    Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> x = open('sample_1005000190.nst')
    >>> y = open('sample_1005000190.nst', 'rb')
    >>> x1 = x.read()
    >>> y1 = y.read()
    >>> len(x1)
    471
    >>> len(y1)
    471
    >>> x1.find('2.001')
    245
    >>> x1[-2:]
    '\x00\n'
    >>> y1[-2:]
    '\x00\n'
    >>> x.seek(0,0)
    >>> x2 = x.readlines()
    >>> len(x2)
    1
    >>> len(x2[0])
    471
    >>>

The "2.001" is located at byte 245, so there are 226 bytes (471 - 245)
from there to the end of the line, counting the newline.  However, there
are zero bytes (meaning '\x00') in this file, which might be confusing
you.

You have to know something about this data format to know how to modify
it.  It looks like the file consists of two major sections, separated by
0x1C characters.  The major sections are then divided into records
separated by 0x1D characters.  Some of the records have fields in them,
separated by 0x1E.  There are 38 bytes of what look like garbage after
the last field.

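To see where the invisible bytes are, it can help to count them rather than print the string. The following is only a minimal sketch, written for Python 3 (where a binary read returns `bytes`) rather than the Python 2.4 used in the session above. The helper names `count_control_bytes` and `bytes_from` are made up for illustration, and `sample` is a short invented stand-in, not the contents of the real file. FS, GS, RS and US are the standard ASCII names for 0x1C through 0x1F.

```python
# Hypothetical helpers for inspecting a binary record file; not from any library.
SEPARATORS = {0x00: "NUL", 0x1C: "FS", 0x1D: "GS", 0x1E: "RS", 0x1F: "US"}

def count_control_bytes(data):
    """Return {name: count} for the control bytes of interest found in data."""
    counts = {}
    for byte in data:  # iterating over bytes yields ints in Python 3
        name = SEPARATORS.get(byte)
        if name is not None:
            counts[name] = counts.get(name, 0) + 1
    return counts

def bytes_from(data, tag):
    """Number of bytes from the first occurrence of tag to the end of data."""
    start = data.find(tag)
    if start < 0:
        raise ValueError("tag not found: %r" % (tag,))
    return len(data) - start

# Invented stand-in data; for the real file, read it with open(path, 'rb').
sample = b"1.001:12\x1d1.002:3000\x1c2.001:9\x1d2.002:0\x1c\x00\x00\n"
print(repr(sample))                  # repr() shows every byte, nulls included
print(count_control_bytes(sample))
print(bytes_from(sample, b"2.001"))
```

Printing `repr()` of the data, rather than the data itself, is what keeps the display from stopping at the first null.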
So, you could parse it into records like this:

    Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> x = open('sample_1005000190.nst','rb').read()
    >>> sections = x.split('\x1c')
    >>> len(sections)
    3
    >>> [len(k) for k in sections]
    [244, 187, 38]
    >>> rec1 = sections[0].split('\x1d')
    >>> rec2 = sections[1].split('\x1d')
    >>> len(rec1)
    11
    >>> len(rec2)
    10
    >>> rec1
    ['1.001:0000000245', '1.002:3000',
     '1.003:1\x1f19\x1e2\x1f0\x1e4\x1f1\x1e4\x1f2\x1e4\x1f3\x1e4\x1f4\x1e4\x1f5\x1e4\x1f6\x1e4\x1f7\x1e4\x1f8\x1e4\x1f9\x1e4\x1f10\x1e4\x1f11\x1e4\x1f12\x1e4\x1f13\x1e4\x1f14\x1e15\x1f15\x1e15\x1f16\x1e15\x1f17\x1e15\x1f18',
     '1.004:NPS', '1.005:20081029', '1.006:4', '1.007:51/Live Scan',
     '1.008:51/Live Scan', '1.009:0844251404U', '1.011:19.6850',
     '1.012:19.6850']
    >>> rec2
    ['2.001:0000000188', '2.002:0', '2.003:3000', '2.010:1005000190',
     '2.019:20081029', '2.029:0', '2.054:Civilian',
     '2.083:01\x1fNA\x1e02\x1fNA\x1e03\x1fNA\x1e04\x1fNA\x1e05\x1fNA\x1e06\x1fNA\x1e07\x1fNA\x1e08\x1fNA\x1e09\x1fNA\x1e10\x1fNA',
     '2.233:\xc8\xcf\xe6\xe4', '2.235:1011973400606']
    >>>

Here, "sections" contains the three major sections.  "rec1" contains the
records from the first section.  If you wanted to add a "1.013" record to
the first section, you could say:

    rec1.append("1.013:Cool Beans")

and then rebuild the file by saying:

    newsections = ['\x1d'.join(rec1), '\x1d'.join(rec2), sections[2]]
    open('newfile.nst', 'wb').write('\x1c'.join(newsections))

But that assumes there's nothing in that garbage 3rd section that needs
to be changed.

It's just a matter of dividing the problem up into smaller problems until
the solution pops out.

-- 
Tim Roberts, t...@probo.com
Providenza & Boekelheide, Inc.

_______________________________________________
python-win32 mailing list
python-win32@python.org
http://mail.python.org/mailman/listinfo/python-win32
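
One thing the rebuild step above leaves untouched is the number in the "1.001" and "2.001" records. In the sample those values (245 and 188) are each one more than the section lengths the session reports (244 and 187). That suggests, but does not prove, that each one holds the byte count of its section including the trailing 0x1C. Under that assumption, a rough sketch of recomputing the value after adding a record might look like the following. It is written for Python 3 with `bytes`, `with_updated_length` is a hypothetical name, and the records are shortened stand-ins; check the format's specification before relying on any of it.

```python
GS = b"\x1d"  # separates records within a section, as in the session above
FS = b"\x1c"  # separates (and is assumed to terminate) each section

def with_updated_length(records):
    """Return a copy of records whose first entry ('N.001:<digits>') holds
    the joined section length plus one byte for the trailing FS.

    Assumption: the count keeps its original zero-padded width, so
    rewriting it does not itself change the section size.
    """
    tag, _, old_digits = records[0].partition(b":")
    width = len(old_digits)
    # Measure the section using a placeholder of the same width.
    body = GS.join([tag + b":" + b"0" * width] + list(records[1:]))
    total = len(body) + len(FS)
    digits = str(total).encode("ascii").rjust(width, b"0")
    if len(digits) != width:
        raise ValueError("length %d does not fit in %d digits" % (total, width))
    return [tag + b":" + digits] + list(records[1:])

# Shortened stand-in records, not the real file contents.
rec = [b"1.001:0000000000", b"1.002:3000", b"1.004:NPS"]
rec.append(b"1.013:Cool Beans")
updated = with_updated_length(rec)
section = GS.join(updated) + FS
print(updated[0], len(section))
```

Keeping the count at a fixed width avoids the circular problem of the length field changing the length it describes.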