Hi Orna,

On Mon, Jul 20, 2015 at 11:56 AM, Orna Agmon Ben-Yehuda <ladyp...@gmail.com>
wrote:

> Hello everyone,
>
> I often have damaged text files (due to a lovely storage system). The
> files are of different formats, although I can usually assume they contain
> spaces. The files are structured as lines.
>
> Every once in a while, the lovely destruction (ahm....storage) system
> inserts binary garbage to the file. I wish to fix the files by removing the
> cancer without leaving any leftovers. That is, I want to lose partial lines.
>
> I tried using grep with all sorts of keys, but it did not do the trick.
> strings catches too little - it leaves partial lines.
> Is there an elegant way to do the trick line-wise?
>
>
It would help to know exactly which lines you wish to eliminate. Otherwise,
you can do various tasks like that using perl -lane (possibly together with
the -i flag, to edit the file in place). E.g. (untested):

$ export THRESH=5
$ perl -lan -E 'print unless ((() = /([\x80-\xFF])/g) > $ENV{THRESH})' \
    < existing-file.txt > new-file.txt

The "ruby" executable has similar flags (with Ruby's expression syntax,
naturally).

Hope it helps.

Regards,

— Shlomi Fish


-- 
Chuck Norris helps the gods that help themselves.

Please reply to list if it's a mailing list post - http://shlom.in/reply .
_______________________________________________
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il