Re: [gentoo-user] Join plain text paragraphs

2006-06-13 Thread Jim
* on the Tue, Jun 13, 2006 at 07:28:17AM +0100, David Morgan said:
> On 22:46 Mon 12 Jun , JimD wrote:
> > David Morgan wrote:
> > > On 18:53 Mon 12 Jun , JimD wrote:
> > >> Sweet.  Thanks for the tips.  I need to start using OOo more ;-)
> > > 
> > > No need.
> > > 
> > > sed -e :a -e '$!N;s/\n[^$]//;ta' -e 'p;D' filename
> > 
> > Close.  It is removing the first character of every paragraph.  I am
> > trying to digitize my book collection.  For example, here is a test
> > output from Narnia - The Magician's Nephew:
> 
> Indeed - didn't my corrected version get through? I received it before I
> received your reply anyway.
> 
> sed -e :a -e '$!N;s/\n\([^$]\)/\1/;ta' -e 'p;D' filename

Almost perfect.  It now joins the lines without removing the first
character.  However, there is now no space between the joined lines.
For example:


CHAPTER ONE
THE WRONG DOOR

becomes

CHAPTER ONETHE WRONG DOOR

I added a space to the end of every line except blank lines, and now it
gets me pretty much what I was looking for.
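
For reference, the same result can probably be had in one pass, without
padding the lines first. This is only a sketch: it swaps in awk's
paragraph mode (RS="") for the sed loop quoted above, the name
join_paragraphs and the file names are made up for illustration, and it
assumes paragraphs are separated by truly empty lines (no stray spaces).

```shell
# join_paragraphs: hypothetical helper, not from the thread.
# RS=""       puts awk in paragraph mode: each run of non-blank lines is one record.
# ORS="\n\n"  writes a blank line after every joined paragraph.
# gsub        replaces each line break inside a paragraph (plus any spaces
#             around it) with a single space.
join_paragraphs() {
    awk 'BEGIN { RS = ""; ORS = "\n\n" }
         { gsub(/ *\n */, " "); print }' "$@"
}

# Example (book.txt and book-joined.txt are placeholder names):
#   join_paragraphs book.txt > book-joined.txt
printf 'CHAPTER ONE\nTHE WRONG DOOR\n\nThis is a story about\nsomething that happened.\n' | join_paragraphs
```

Since nothing but a line break separates CHAPTER ONE from THE WRONG
DOOR, they still end up on one line, separated by a space.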

Thanks,

Jim
-- 
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] Join plain text paragraphs

2006-06-12 Thread David Morgan
On 22:46 Mon 12 Jun , JimD wrote:
> David Morgan wrote:
> > On 18:53 Mon 12 Jun , JimD wrote:
> >> Sweet.  Thanks for the tips.  I need to start using OOo more ;-)
> > 
> > No need.
> > 
> > sed -e :a -e '$!N;s/\n[^$]//;ta' -e 'p;D' filename
> 
> Close.  It is removing the first character of every paragraph.  I am
> trying to digitize my book collection.  For example, here is a test
> output from Narnia - The Magician's Nephew:

Indeed - didn't my corrected version get through? I received it before I
received your reply anyway.

sed -e :a -e '$!N;s/\n\([^$]\)/\1/;ta' -e 'p;D' filename

-- 
Join The no2id Coalition, http://www.no2id.net/

djm



Re: [gentoo-user] Join plain text paragraphs

2006-06-12 Thread JimD
David Morgan wrote:
> On 18:53 Mon 12 Jun , JimD wrote:
>> Sweet.  Thanks for the tips.  I need to start using OOo more ;-)
> 
> No need.
> 
> sed -e :a -e '$!N;s/\n[^$]//;ta' -e 'p;D' filename

Close.  It is removing the first character of every paragraph.  I am
trying to digitize my book collection.  For example, here is a test
output from Narnia - The Magician's Nephew:

=== cut here ==
CHAPTER ONE
THE WRONG DOOR

This is a story about something that happened long ago when your
grandfather was a child. It is a very important story because it shows
how all the comings and goings between our own world and the land of
Narnia first began.

In those days Mr Sherlock Holmes was still living in Baker Street and
the Bastables were looking for treasure in the Lewisham Road. In those
days, if you were a boy you had to wear a stiff Eton collar every day,
and schools were usually nastier than now. But meals were nicer; and
as for sweets, I won't tell you how cheap and good they were, because
it would only make your mouth water in vain. And in those days there
lived in London a girl called Polly Plummer.

She lived in one of a long row of houses which were all joined together.
One morning she was out in the back garden when a boy scrambled up from
the garden next door and put his face over the wall. Polly was very
surprised because up till now there had never been any children in that
house, but only Mr Ketterley and Miss Ketterley, a brother and sister,
old bachelor and old maid, living together. So she looked up, full of
curiosity. The face of the strange boy was very grubby. It could hardly
have been grubbier if he had first rubbed his hands in the earth, and
then had a good cry, and then dried his face with his hands. As a matter
of fact, this was very nearly what he had been doing.
=== cut here ==

Jim
-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
You roll an 18 in Dex and see if you
don't end up with a girlfriend
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
JimD
Central FL, USA, Earth, Sol



Re: [gentoo-user] Join plain text paragraphs

2006-06-12 Thread David Morgan
On 00:13 Tue 13 Jun , David Morgan wrote:
> sed -e :a -e '$!N;s/\n[^$]//;ta' -e 'p;D' filename

Gosh, what was I thinking?

sed -e :a -e '$!N;s/\n\([^$]\)/\1/;ta' -e 'p;D' filename

I expect there's a slightly nicer way, but I'm tired and I have an exam
in the morning...
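
A note on the difference between the two commands: inside a bracket
expression, $ is just a literal dollar sign, so [^$] means "any
character except $", not "not at end of line". The first version
matched the newline plus the character after it and threw both away;
the second captures that character and puts it back. A cut-down sketch
on a two-line sample (the function names and the sample text are
invented for illustration; only N and the two s commands come from the
thread):

```shell
# N appends the next input line to the pattern space, giving "ab\ncd".

# First version: newline and the following character are both deleted.
drop_first_char() { sed -e 'N' -e 's/\n[^$]//'; }

# Corrected version: the character after the newline is captured and kept.
keep_first_char() { sed -e 'N' -e 's/\n\([^$]\)/\1/'; }

printf 'ab\ncd\n' | drop_first_char   # prints "abd"
printf 'ab\ncd\n' | keep_first_char   # prints "abcd"
```

One likely side effect of [^$]: a line that begins with a literal $
would not be joined to the line before it.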




Re: [gentoo-user] Join plain text paragraphs

2006-06-12 Thread David Morgan
On 18:53 Mon 12 Jun , JimD wrote:
> Sweet.  Thanks for the tips.  I need to start using OOo more ;-)

No need.

sed -e :a -e '$!N;s/\n[^$]//;ta' -e 'p;D' filename




Re: [gentoo-user] Join plain text paragraphs

2006-06-12 Thread JimD
Alan McKinnon wrote:
> On Monday 12 June 2006 19:22, JimD wrote:
>> I have an MS Word "HTML" file.  I used Lynx to dump it to text and
>> now I want to get it to pdf.  I opened it in OOo and saved as an
>> OpenDocument. However, all the paragraphs are hard wrapped at 80
>> characters so the text does not take up the whole page.
>>
>> Is there an easy way to go through the 100+ pages and just join the
>> lines of each paragraph so that they will be flowed correctly in
>> OOo?
>>
>> I have the dumped text file and the OOo file and both have the
>> paragraphs hard wrapped at column 80.  I would think there would
>> have to be some simple tool out there to go through the plain text
>> file and just join all the lines of a paragraph, no?
> 
> You already have an OOo file so that's a good place to start.
> 
> First, check on Tools -> Autocorrect -> Options that "Remove blank 
> paragraphs" is checked. Then highlight all the text you want to 
> modify and do Format -> AutoFormat -> Apply.
> 
> This should remove hard line returns in the middle of paras then 
> remove blank paras. Then print to pdf.

Sweet.  Thanks for the tips.  I need to start using OOo more ;-)

Jim



Re: [gentoo-user] Join plain text paragraphs

2006-06-12 Thread Alan McKinnon
On Monday 12 June 2006 19:22, JimD wrote:
> I have an MS Word "HTML" file.  I used Lynx to dump it to text and
> now I want to get it to pdf.  I opened it in OOo and saved as an
> OpenDocument. However, all the paragraphs are hard wrapped at 80
> characters so the text does not take up the whole page.
>
> Is there an easy way to go through the 100+ pages and just join the
> lines of each paragraph so that they will be flowed correctly in
> OOo?
>
> I have the dumped text file and the OOo file and both have the
> paragraphs hard wrapped at column 80.  I would think there would
> have to be some simple tool out there to go through the plain text
> file and just join all the lines of a paragraph, no?

You already have an OOo file so that's a good place to start.

First, check on Tools -> Autocorrect -> Options that "Remove blank 
paragraphs" is checked. Then highlight all the text you want to 
modify and do Format -> AutoFormat -> Apply.

This should remove hard line returns in the middle of paras then 
remove blank paras. Then print to pdf.

-- 
If only me, you and dead people understand hex, 
how many people understand hex?

Alan McKinnon
alan at linuxholdings dot co dot za
+27 82, double three seven, one nine three five



[gentoo-user] Join plain text paragraphs

2006-06-12 Thread JimD
I have an MS Word "HTML" file.  I used Lynx to dump it to text and now I
want to get it to pdf.  I opened it in OOo and saved as an OpenDocument.
However, all the paragraphs are hard wrapped at 80 characters so the
text does not take up the whole page.

Is there an easy way to go through the 100+ pages and just join the
lines of each paragraph so that they will be flowed correctly in OOo?

I have the dumped text file and the OOo file and both have the
paragraphs hard wrapped at column 80.  I would think there would have to
be some simple tool out there to go through the plain text file and just
join all the lines of a paragraph, no?

Thanks,

Jim