Ralph Shumaker wrote:
> I want to do this:
> cat myFile | sed -e "s/[ ]*CTRL-M/\n/g" > myFileCleaned
> where Ctrl-M is character 0x013 and \n is a newline.
>
> I have a file that has many, many, many long lines, each with many sets
> of data, each set being separated by many spaces, each instance of which
> is ended by character 0x013, kinda like:
> data ^Mdata ^Mdata ^Mdata ^Mdata ^M
> but the spaces before the ^M are about 77 in number (seems to be
> consistent), and the data strings are longer, containing several
> elements, each separated by either one, two, or three spaces. If I can
> match on any number of spaces ("[ ]*") which are immediately followed by
> 0x013 (^M) and replace each instance with a newline, I'll be set (almost
> certainly).

As others have said, the ^M (0x0D, or CR, "carriage return") may
indicate that you have a DOS-format file, in which each line ending is
actually a CR,LF pair (0x0D, 0x0A). (You said 0x013, but I think you may
have been confusing decimal with hex, since hex 0x0D = decimal 13.) Of
course, you may instead have a Mac-format file, which uses a bare CR as
the line delimiter.

You should examine a piece of the file to find out for sure. There are
several programs capable of giving (say) hex dumps -- od, hexdump, and
my favorite, xxd.

    xxd -g1 -l128 file.txt

will look at the first 128 bytes and give hex (and string) output for
each single byte. If you see things like

    0000000: 68 65 6c 6c 6f 0d 0a 77 6f 72 6c 64 0d 0a  hello..world..

then the '0d 0a' sequences confirm the DOS CR,LF format.

> I think I recall \n being the equivalent of a newline, although I may be
> confusing things with my brief venture into perl.
>
> I did man regexp, but didn't find what I wanted. I'm not sure where
> else to look.
>
> I'm sure that vim could probably do it, but I have already found that
> trying to search for specific things in that complexity is like looking
> for a tiny stainless steel needle in a humongous haystack. Magnets
> won't do me any good.
>
> I have already had dealings with sed and regexp, and figured this would
> be a good opportunity to pick up a new trick.

If you _want_ to use sed, then man sed is the place to look. :-)

If you do have a DOS file, then to strip the trailing space characters
and the CR, you would do:

    sed -e's/ *CR//' file.old >file.new

Here CR stands for a literal carriage-return character, not the two
letters C and R; in most shells you can type it as Ctrl-V followed by
Ctrl-M.

If your file has exceptions to the CR,LF endings, or if this isn't quite
what you want to do, perhaps you should explain a little more. :-)

Regards,
..jim

-- 
KPLUG-List@kernel-panic.org
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-list
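
P.S. In case it helps, here is a minimal sketch of the strip-the-CR step
that avoids typing the control character at all. It assumes the file
really is DOS format (CR,LF line endings). The sample data, the tmpdir
location, and the CR shell variable are made up for illustration; they
are not anything from your actual file.

```shell
#!/bin/sh
# Hypothetical sample: two DOS-style lines, each with trailing spaces
# before the CR,LF line ending.
tmpdir=$(mktemp -d)
printf 'hello   \r\nworld   \r\n' > "$tmpdir/file.old"

# Hold a literal carriage return in a variable (assumed name: CR),
# so it does not have to be typed with Ctrl-V Ctrl-M.
CR=$(printf '\r')

# Remove any run of spaces followed by a CR at the end of each line.
sed -e "s/ *${CR}\$//" "$tmpdir/file.old" > "$tmpdir/file.new"

# Dump the result one byte at a time; no 0d bytes should remain.
od -An -tx1 "$tmpdir/file.new"
```

This only works when each CR is followed by an LF. If the dump shows
bare 0d bytes with no 0a after them, sed treats the whole run as one
long line, and the CRs themselves would have to be turned into newlines
instead.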