Re: Extending R.E. Syntax (was: Re: Eliminating binary from a text file)

2015-07-21 Thread Shlomi Fish
Hi Omer, On Mon, Jul 20, 2015 at 9:46 PM, Omer Zak w...@zak.co.il wrote: Instead of creating a separate bgrep, it would have been better to be able to extend the syntax of regular expressions (in egrep, Perl and other platforms) to allow specification of binary strings having arbitrary

Re: Eliminating binary from a text file

2015-07-21 Thread Orna Agmon Ben-Yehuda
Amos, we have a winner!!! Exactly what I looked for! Thanks Orna On Tue, Jul 21, 2015 at 7:53 AM, Amos Shapira amos.shap...@gmail.com wrote: Then how about: grep -v -P -a '\x00' file? Based on http://superuser.com/a/612336/27453. Explanation of the flags: -v - inverse - print NON-matching

Re: Eliminating binary from a text file

2015-07-20 Thread Dov Grobgeld
Why not do it through a short python script? Something like (not tested) import os for dirpath, dirnames, filenames in os.walk('damagedfilesystem'): for fn in filenames: if fn.endswith('.txt'): new_fn = fn.replace('.txt','-fixed.txt') out_fh = open(new_fn,'w') for

Re: Eliminating binary from a text file

2015-07-20 Thread Shlomi Fish
Hi Orna, On Mon, Jul 20, 2015 at 11:56 AM, Orna Agmon Ben-Yehuda ladyp...@gmail.com wrote: Hello everyone, I often have damaged text files (due to a lovely storage system). The files are of different formats, although I can usually assume they contain spaces. The files are structured as

Eliminating binary from a text file

2015-07-20 Thread Orna Agmon Ben-Yehuda
Hello everyone, I often have damaged text files (due to a lovely storage system). The files are of different formats, although I can usually assume they contain spaces. The files are structured as lines. Every once in a while, the lovely destruction (ahem, storage) system inserts binary garbage

Re: Eliminating binary from a text file

2015-07-20 Thread Rabin Yasharzadehe
Can you provide an example of the bad lines and how you would like them to look after you fix them? -- Rabin On Mon, Jul 20, 2015 at 11:56 AM, Orna Agmon Ben-Yehuda ladyp...@gmail.com wrote: Hello everyone, I often have damaged text files (due to a lovely storage system). The files are of

Extending R.E. Syntax (was: Re: Eliminating binary from a text file)

2015-07-20 Thread Omer Zak
Instead of creating a separate bgrep, it would have been better to be able to extend the syntax of regular expressions (in egrep, Perl and other platforms) to allow specification of binary strings having arbitrary length by means of a hex string. This would come instead of making it very
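For readers following the idea above: existing regex engines can already be fed an arbitrary binary string written as hex, by converting it before compiling the pattern. The following is a minimal sketch in Python, not the syntax extension Omer proposes; the helper name hex_pattern and the sample bytes are hypothetical and chosen only for illustration.

```python
import re


def hex_pattern(hex_string):
    """Compile a bytes regex that matches the literal bytes given as hex.

    hex_pattern is a hypothetical helper name, not part of any library.
    """
    raw = bytes.fromhex(hex_string)      # e.g. "00ff2e" -> b"\x00\xff."
    return re.compile(re.escape(raw))    # escape so "." etc. match literally


if __name__ == "__main__":
    pattern = hex_pattern("00ff2e")
    match = pattern.search(b"abc\x00\xff.def")
    print(match.start() if match else None)
```

Because the pattern is compiled from bytes, it has to be applied to bytes (a file opened in binary mode), not to decoded text.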

Re: Eliminating binary from a text file

2015-07-20 Thread Shachar Shemesh
On 20/07/15 11:56, Orna Agmon Ben-Yehuda wrote: Hello everyone, I often have damaged text files (due to a lovely storage system). The files are of different formats, although I can usually assume they contain spaces. The files are structured as lines. Every once in a while, the lovely

Re: Eliminating binary from a text file

2015-07-20 Thread Boruch Baum
I see that I'm late to the discussion and that your original problem has morphed a bit. Maybe the simplest and oldest solution is the `tr -d' command. See `man tr'. On 07/20/2015 04:56 AM, Orna Agmon Ben-Yehuda wrote: Hello everyone, I often have damaged text files (due to a lovely storage

Re: Eliminating binary from a text file

2015-07-20 Thread Orna Agmon Ben-Yehuda
The bad data is NULLs (I did not have hexedit, but was introduced to hexedit mode in emacs which proved useful). In the meantime, Muli Ben-Yehuda suggested to prevent the mess to begin with. The corrupted file is the output of a C program. The problem is that the program continues writing to the

Re: Eliminating binary from a text file

2015-07-20 Thread Amos Shapira
+1 for tr -d '\0' < file > newfile, based on the updated description. But prevention is better than a cure - find a way to avoid this in the first place. On 21 July 2015 at 07:22, Boruch Baum boruch_b...@gmx.com wrote: I see that I'm late to the discussion and that your original problem has

Re: Eliminating binary from a text file

2015-07-20 Thread Shachar Shemesh
On 21/07/15 00:22, Boruch Baum wrote: I see that I'm late to the discussion and that your original problem has morphed a bit. Maybe the simplest and oldest solution is the `tr -d' command. See `man tr'. Read the original question again. She needs to eliminate the entire line where a corruption
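The distinction being drawn in this part of the thread, deleting the offending bytes versus dropping the whole line that contains them, can be illustrated with a small Python sketch. It assumes the garbage is NUL bytes, as reported elsewhere in the thread; the function names and the sample data are hypothetical.

```python
def strip_nul_bytes(data):
    """Delete NUL bytes but keep every line (the effect of tr -d on NULs)."""
    return data.replace(b"\x00", b"")


def drop_nul_lines(data):
    """Drop every line that contains at least one NUL byte."""
    return b"".join(
        line
        for line in data.splitlines(keepends=True)
        if b"\x00" not in line
    )


if __name__ == "__main__":
    sample = b"good line 1\nbad\x00\x00 line\ngood line 2\n"
    print(strip_nul_bytes(sample))  # the damaged line survives, minus the NULs
    print(drop_nul_lines(sample))   # the damaged line is gone entirely
```

The first function leaves a possibly truncated or merged line behind, which is why it does not answer the original question; the second removes the damaged line as a unit.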

Re: Eliminating binary from a text file

2015-07-20 Thread Orna Agmon Ben-Yehuda
tr does what strings does - clears only the characters, not the full bad line. Not what I wanted. But now that I know what data I have there, I can clear it using any script (back to Dov's solution). Thanks everyone! On Tue, Jul 21, 2015 at 2:20 AM, Amos Shapira amos.shap...@gmail.com wrote:
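A script along the lines Orna mentions might look like the sketch below. It is not Dov's script, which is only partly visible in this archive; it borrows the 'damagedfilesystem' directory name and the '.txt' to '-fixed.txt' renaming from his message and assumes the bad data is NUL bytes. The function name write_fixed_copies is hypothetical.

```python
import os
from typing import List


def write_fixed_copies(root: str) -> List[str]:
    """Write a '-fixed.txt' copy of each '.txt' file under root,
    omitting every line that contains a NUL byte.

    Returns the paths of the files that were written.
    """
    written = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for fn in filenames:
            # Skip non-text files and copies produced by an earlier run.
            if not fn.endswith(".txt") or fn.endswith("-fixed.txt"):
                continue
            src = os.path.join(dirpath, fn)
            dst = os.path.join(dirpath, fn[: -len(".txt")] + "-fixed.txt")
            # Binary mode, so corrupted bytes cannot raise decoding errors.
            with open(src, "rb") as in_fh, open(dst, "wb") as out_fh:
                for line in in_fh:
                    if b"\x00" not in line:
                        out_fh.write(line)
            written.append(dst)
    return written


if __name__ == "__main__":
    # 'damagedfilesystem' is the placeholder directory from Dov's message.
    for path in write_fixed_copies("damagedfilesystem"):
        print(path)
```

The originals are left untouched, so the output can be compared against the source before anything is deleted.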

Re: Eliminating binary from a text file

2015-07-20 Thread Amos Shapira
Then how about: grep -v -P -a '\x00' file? Based on http://superuser.com/a/612336/27453. Explanation of the flags: -v - inverse - print NON-matching lines -P - use Perl regexp -a - force treating the file as a text file On 21 July 2015 at 13:39, Shachar Shemesh shac...@shemesh.biz wrote: On

Re: Extending R.E. Syntax (was: Re: Eliminating binary from a text file)

2015-07-20 Thread Alexander Sukholitko
Hi, Possibly using strings file_name > new_file_name command could resolve this problem. Thank you. Alex On Mon, Jul 20, 2015 at 9:46 PM, Omer Zak w...@zak.co.il wrote: Instead of creating a separate bgrep, it would have been better to be able to extend the syntax of regular expressions (in