Hi Orna,

On Mon, Jul 20, 2015 at 11:56 AM, Orna Agmon Ben-Yehuda <ladyp...@gmail.com> wrote:
> Hello everyone,
>
> I often have damaged text files (due to a lovely storage system). The
> files are of different formats, although I can usually assume they contain
> spaces. The files are structured as lines.
>
> Every once in a while, the lovely destruction (ahm....storage) system
> inserts binary garbage into the file. I wish to fix the files by removing the
> cancer without leaving any leftovers. That is, I want to lose partial lines.
>
> I tried using grep with all sorts of keys, but it did not do the trick.
> strings catches too little - it leaves partial lines.
> Is there an elegant way to do the trick line-wise?
>

It would help to know exactly which lines you wish to eliminate. Otherwise,
you can do various tasks like that using perl -lane (possibly with the -i
flag, to edit the file in place).

E.g. (untested):

    $ export THRESH=5
    $ perl -lan -E 'print unless ((() = /([\x80-\xFF])/g) > $ENV{THRESH})' < existing-file.txt > new-file.txt

This prints every line except those containing more than $THRESH bytes in
the \x80-\xFF range.

The "ruby" executable has similar flags (with Ruby's expression syntax,
naturally).

Hope it helps.

Regards,

    Shlomi Fish

-- 
Chuck Norris helps the gods that help themselves.

Please reply to list if it's a mailing list post - http://shlom.in/reply .
_______________________________________________
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il