Re: [SLUG] A quick way to remove masses of whitespace from a csv file.

2007-01-29 Thread Martin Visser

On 1/27/07, Steve Kowalik [EMAIL PROTECTED] wrote:



Just to throw my 2 cents in, tr(1) can do this.

tr -s ' ' < oldfile > newfile



This would squeeze *every* run of spaces in the document down to a single
space, wherever it occurs. Luke clarified in a follow-up post that it was
only whitespace butting up against quotes that needed to be removed.
Unfortunately tr doesn't deal with context, whereas sed (and perl/python)
can.
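
A small sketch of the difference, using a made-up sample line (the field contents are an assumption, not data from Luke's file). tr -s squeezes each run of spaces to one space wherever it occurs, while a sed substitution can be tied to the quote that follows the spaces:

```shell
# Hypothetical sample line; not taken from the real file.
line='"Sydney  Linux   ","User Group      "'

# tr -s squeezes every run of spaces to one space, with no notion of context.
squeezed=$(printf '%s\n' "$line" | tr -s ' ')

# sed can require that the spaces sit directly in front of a double quote.
trimmed=$(printf '%s\n' "$line" | sed 's/  *"/"/g')

printf '%s\n' "$squeezed" "$trimmed"
```

With tr a single space is still left in front of each closing quote, and the double space between the two words is collapsed as well. The sed version leaves the words alone, though it would also eat spaces in front of an opening quote if the file had any.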

--
Regards, Martin

Martin Visser
--
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html


Re: [SLUG] A quick way to remove masses of whitespace from a csv file.

2007-01-26 Thread Steve Kowalik
On Fri, 26 Jan 2007 09:50:11 +1100, Luke Yelavich uttered
 I have been given a spreadsheet, which I have exported to a csv file, 
 for use with a community website I am working on.
 
 I have a nasty problem, where a lot of the cells have massive amounts of
 whitespace. I am wondering whether there is a quick way to remove the 
 whitespace, either in the spreadsheet, or in the csv file? I am guessing 
 a regular expression could do it, but I am no regular expression expert.
 
Just to throw my 2 cents in, tr(1) can do this.

tr -s ' ' < oldfile > newfile

Cheers,
-- 
Steve
I'm a doctor, not a doorstop
 - EMH, USS Enterprise


[SLUG] A quick way to remove masses of whitespace from a csv file.

2007-01-25 Thread Luke Yelavich
Hi all
I have been given a spreadsheet, which I have exported to a csv file, 
for use with a community website I am working on.

I have a nasty problem, where a lot of the cells have massive amounts of
whitespace. I am wondering whether there is a quick way to remove the 
whitespace, either in the spreadsheet, or in the csv file? I am guessing 
a regular expression could do it, but I am no regular expression expert.

Thanks in advance.
-- 
Luke Yelavich
GPG key: 0xD06320CE 
 (http://www.themuso.com/themuso-gpg-key.txt)
Email  MSN: [EMAIL PROTECTED]
Jabber: [EMAIL PROTECTED]



Re: [SLUG] A quick way to remove masses of whitespace from a csv file.

2007-01-25 Thread Sonia Hamilton

Luke Yelavich wrote:

Hi all
I have been given a spreadsheet, which I have exported to a csv file, 
for use with a community website I am working on.


I have a nasty problem, where a lot of the cells have massive amounts of
whitespace. I am wondering whether there is a quick way to remove the 
whitespace, either in the spreadsheet, or in the csv file? I am guessing 
a regular expression could do it, but I am no regular expression expert.


Sed is probably your friend here (though you could also try tr, awk, 
perl, python, ...).


I have a whole lot of sed links here [1], that lead off to sed 
cheatsheets, etc.


[1] http://www.snowfrog.net/?q=taxonomy/term/41

--
Sonia Hamilton. GPG key A8B77238.


Re: [SLUG] A quick way to remove masses of whitespace from a csv file.

2007-01-25 Thread Luke Yelavich
On Fri, Jan 26, 2007 at 09:50:11AM EST, Luke Yelavich wrote:
 Hi all
 I have been given a spreadsheet, which I have exported to a csv file, 
 for use with a community website I am working on.
 
 I have a nasty problem, where a lot of the cells have massive amounts of
 whitespace. I am wondering whether there is a quick way to remove the 
 whitespace, either in the spreadsheet, or in the csv file? I am guessing 
 a regular expression could do it, but I am no regular expression expert.

I should clarify my needs a bit. There are text strings in this file
that have spaces between words. The whitespace is only at the end of 
fields, and all text strings have quotes around them. I also think this 
whitespace is between the last character, and the quote in the text 
string.
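
Going by that description, one possible sketch is to remove spaces only where they sit between the last character and a closing quote, taking "closing" to mean a quote followed by a comma or by the end of the line. The sample line is invented, and -E (extended regular expressions) is assumed to be available, as it is in current GNU and BSD sed.

```shell
# Invented sample: trailing spaces inside quoted fields, plus a double
# space between two words that should be left untouched.
line='"community  website    ",42,"Luke      "'

# Delete spaces that come directly before a quote which ends a field,
# i.e. a quote followed by a comma or by the end of the line.
cleaned=$(printf '%s\n' "$line" | sed -E 's/ +"(,|$)/"\1/g')

printf '%s\n' "$cleaned"
```

This is a line-by-line text substitution, not a CSV parser, so fields that contain embedded quotes or line breaks would need something more careful.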
-- 
Luke Yelavich
GPG key: 0xD06320CE 
 (http://www.themuso.com/themuso-gpg-key.txt)
Email  MSN: [EMAIL PROTECTED]
Jabber: [EMAIL PROTECTED]



Re: [SLUG] A quick way to remove masses of whitespace from a csv file.

2007-01-25 Thread Alex Samad
On Fri, Jan 26, 2007 at 10:58:09AM +1100, Sonia Hamilton wrote:
 Luke Yelavich wrote:
 Hi all
 I have been given a spreadsheet, which I have exported to a csv file, 
 for use with a community website I am working on.
 
 I have a nasty problem, where a lot of the cells have massive amounts of
 whitespace. I am wondering whether there is a quick way to remove the 
 whitespace, either in the spreadsheet, or in the csv file? I am guessing 
 a regular expression could do it, but I am no regular expression expert.
 
 Sed is probably your friend here (though you could also try tr, awk, 
 perl, python, ...).
 
 I have a whole lot of sed links here [1], that lead off to sed 
 cheatsheets, etc.
 
 [1] http://www.snowfrog.net/?q=taxonomy/term/41

something like sed -e 's/  +/ /g' oldfile > newfile

That says, roughly: find a space followed by at least one more space (as
many as there are) and change the whole run into one space.

The difference between * and +: * means 0 or more, + means 1 or more.
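
To make the * versus + distinction concrete, here is a sketch with three made-up input lines. It assumes sed is run with -E (extended regular expressions), so that + acts as a repetition operator.

```shell
# Three made-up lines: no space, one space, and several spaces between x and y.
input=$(printf '%s\n' 'xy' 'x y' 'x    y')

# " *" allows zero or more spaces, so all three lines are rewritten.
star=$(printf '%s\n' "$input" | sed -E 's/x *y/[match]/')

# " +" needs at least one space, so the first line is left alone.
plus=$(printf '%s\n' "$input" | sed -E 's/x +y/[match]/')

# Two spaces followed by + is "a space, then one or more spaces":
# every run of two or more spaces becomes a single space.
squeezed=$(printf '%s\n' 'a    b  c' | sed -E 's/  +/ /g')

printf '%s\n' "$star" "$plus" "$squeezed"
```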

 

Re: [SLUG] A quick way to remove masses of whitespace from a csv file.

2007-01-25 Thread Alex Samad
On Fri, Jan 26, 2007 at 11:33:33AM +1100, Luke Yelavich wrote:
 On Fri, Jan 26, 2007 at 09:50:11AM EST, Luke Yelavich wrote:
  Hi all
  I have been given a spreadsheet, which I have exported to a csv file, 
  for use with a community website I am working on.
  
  I have a nasty problem, where a lot of the cells have massive amounts of
  whitespace. I am wondering whether there is a quick way to remove the 
  whitespace, either in the spreadsheet, or in the csv file? I am guessing 
  a regular expression could do it, but I am no regular expression expert.
 
 I should clarify my needs a bit. There are text strings in this file
 that have spaces between words. The whitespace is only at the end of 
 fields, and all text strings have quotes around them. I also think this 
 whitespace is between the last character, and the quote in the text 
 string.

sed -e 's/ +//' oldfile > newfile


Re: [SLUG] A quick way to remove masses of whitespace from a csv file.

2007-01-25 Thread Luke Yelavich
On Fri, Jan 26, 2007 at 11:36:10AM EST, Alex Samad wrote:
 sed -e 's/ +//' oldfile > newfile

It seems that changing the + to an * did the trick. Thanks.
-- 
Luke Yelavich
GPG key: 0xD06320CE 
 (http://www.themuso.com/themuso-gpg-key.txt)
Email  MSN: [EMAIL PROTECTED]
Jabber: [EMAIL PROTECTED]



Re: [SLUG] A quick way to remove masses of whitespace from a csv file.

2007-01-25 Thread Alex Samad
On Fri, Jan 26, 2007 at 12:05:06PM +1100, Luke Yelavich wrote:
 On Fri, Jan 26, 2007 at 11:36:10AM EST, Alex Samad wrote:
  sed -e 's/ +//' oldfile > newfile
 
 It seems that changing the + to an * did the trick. Thanks.
strange 

the above should work for 

test= test

and shouldn't do anything for 

test = test

I tried some tests and you're right, but when I escaped the + so it looks like

sed -e 's/ \+//'

will have to investigate later 



Re: [SLUG] A quick way to remove masses of whitespace from a csv file.

2007-01-25 Thread Alex Samad
On Fri, Jan 26, 2007 at 12:13:49PM +1100, Alex Samad wrote:
 On Fri, Jan 26, 2007 at 12:05:06PM +1100, Luke Yelavich wrote:
  On Fri, Jan 26, 2007 at 11:36:10AM EST, Alex Samad wrote:
   sed -e 's/ +//' oldfile > newfile
  
  It seems that changing the + to an * did the trick. Thanks.
 strange 
 
 the above should work for 
 
 test= test
 
 and shouldn't do anything for 
 
 test = test
 
 I tried some tests and you're right, but when I escaped the + so it looks like
 
 sed -e 's/ \+//'
 
 will have to investigate later 

some more tests with perl


[EMAIL PROTECTED]:~$ echo 'test' | perl -pe  's/ \+//'
test
[EMAIL PROTECTED]:~$ echo 'test' | perl -pe  's/ +//'
test

so sed is doing something with the +
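
A likely explanation, offered as a sketch and not a definitive diagnosis: sed uses POSIX basic regular expressions by default, in which an unescaped + is an ordinary character, whereas perl treats + as "one or more". GNU sed accepts \+ as an extension, and -E switches sed to extended syntax, where a bare + repeats. The sample string below is made up.

```shell
s='test   = test'

# Basic syntax: " +" means a space followed by a literal plus sign,
# so nothing in the sample matches and the line is unchanged.
bre=$(printf '%s\n' "$s" | sed -e 's/ +//')

# Extended syntax (-E): " +" means one or more spaces; the first run goes.
ere=$(printf '%s\n' "$s" | sed -E -e 's/ +//')

# " *" also matches the empty string, so without g it matches at the very
# start of the line and removes nothing; with g every run of spaces goes.
star=$(printf '%s\n' "$s" | sed -e 's/ *//')
star_g=$(printf '%s\n' "$s" | sed -e 's/ *//g')

printf '%s\n' "$bre" "$ere" "$star" "$star_g"
```

This would also fit the earlier report that swapping + for * "did the trick", provided the g flag was in use: " *" with g strips every space on the line, which is more than just the padding before the quotes.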

 
 

Re: [SLUG] A quick way to remove masses of whitespace from a csv file.

2007-01-25 Thread Jeff Waugh
<quote who="Luke Yelavich">

 I should clarify my needs a bit. There are text strings in this file that
 have spaces between words. The whitespace is only at the end of fields,
 and all text strings have quotes around them. I also think this whitespace
 is between the last character, and the quote in the text string.

There have been some answers already, but here's something else you could
try if the whitespace is not just made of spaces...

  sed 's#[[:space:]]*##g'  file.txt
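
Run as shown, that pattern deletes every whitespace character on the line, including the spaces between words. A hedged variant that keeps the [[:space:]] class but confines it to whitespace directly in front of a double quote might look like this; the sample line, with a tab before the first closing quote, is invented.

```shell
# Invented sample with a tab and spaces before the closing quotes.
line=$(printf '"open source\t  ","cebit  "')

# As posted: every whitespace character is removed, including between words.
all_gone=$(printf '%s\n' "$line" | sed 's#[[:space:]]*##g')

# Variant: only whitespace directly in front of a double quote is removed.
before_quote=$(printf '%s\n' "$line" | sed 's#[[:space:]][[:space:]]*"#"#g')

printf '%s\n' "$all_gone" "$before_quote"
```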

:-)

- Jeff

-- 
Open CeBIT 2007: Sydney, Australia  http://www.opencebit.com.au/
 
   I don't know whose brain child it was, but it was quite an ugly child.