Re: modify a text file

2006-04-29 Thread Gerald Lai

On Sat, 29 Apr 2006, Vim Visual wrote:

[snip]

I solved it like this:

:1,/received/d
:$?^\s*For subscribe options?,$d
:let @a=''
:g/hole\|relativistic\|LISA\|black\|supermassive\|intermediate/?^\s*astro-ph?,/^\s*astro-ph/-y A
:%d
:put a
:1d
:%s!^\s*astro-ph/\(\d\+\)!<a href="http://xxx.lanl.gov/pdf/astro-ph/\1">&</a>
:w! /tmp/2.html
:q

and using the

cat /tmp/1.html | vim -s foo_arXiv.vim -

[snip]

For what it's worth, here's a search regex that will search for words in
blocks delimited by #begin# and #end# on its own line:

  (all on one line)

  /^#begin#\%(
   \%(\n\(#end#\_$\)[EMAIL PROTECTED])*
 \&\%([EMAIL PROTECTED])*
 \&\%([EMAIL PROTECTED])*
   \)\_.\{-}\_^\1$

For example, if

   = foo
   = bar
   = xxx

then the regex will match

==
#begin#

On one foofine day, a regex

  found itself in the local
bar... There were many
regexxxes there to
begin
and
end
with.
#end#
==

You can add on as many words as you need according to the \& format
above. After you are satisfied with the search, you can do something
like

  :let @a = ''
  :g//.,/#end#/y A

to yank all the blocks.
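
For readers who want to check the idea outside Vim, here is a minimal Python sketch of the same selection: keep every #begin#/#end# block whose body contains all of the given words. It is not a translation of the regex above; the function name and the sample text are made up for illustration.

```python
import re

def blocks_with_all_words(text, words):
    """Return every #begin#...#end# block whose body contains all of `words`."""
    # A block starts at a line that is exactly "#begin#" and ends at the
    # first following line that is exactly "#end#".
    block_re = re.compile(r"^#begin#\n(.*?)^#end#$", re.MULTILINE | re.DOTALL)
    hits = []
    for match in block_re.finditer(text):
        body = match.group(1)
        # Plain substring test, so "foo" also matches inside "foofine",
        # as in the example above.
        if all(word in body for word in words):
            hits.append(match.group(0))
    return hits

# Hypothetical sample input: only the first block holds all three words.
sample = """intro
#begin#
On one foofine day, a regex
found itself in the local
bar... There were many
regexxxes there.
#end#
#begin#
only foo here
#end#
"""

matches = blocks_with_all_words(sample, ["foo", "bar", "xxx"])
```

Only the body between the delimiters is tested, so a search word such as "end" does not match the delimiter lines themselves.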

There was also discussion on this mailing list earlier (between Yakov
and me) on how to search for words in a paragraph. Assuming a paragraph
is delimited by blank lines ^\s*$, we have this search regex instead:

  (all on one line)

  /^\s*\%(
   \%(\n\%(\s*\_$\)[EMAIL PROTECTED])*
 \&\%(\n\%(\s*\_$\)[EMAIL PROTECTED])*
 \&\%(\n\%(\s*\_$\)[EMAIL PROTECTED])*
   \)\_.\{-}\ze\_^\s*$

HTH :)
--
Gerald


Re: modify a text file

2006-04-29 Thread Vim Visual

The problem is that I have to tell vim to convert the end of line and
empty spaces into HTML markup, and I do it like this:


I mean:

The problem is that I have to tell vim to convert the end of line and
empty LINES into HTML markup, and I do it like this:


Re: modify a text file

2006-04-29 Thread Vim Visual

> when I run it I end up with an empty file... ?

Well, you should be able to trace through it by entering
each of the commands individually instead of sourcing the
file to see where things are going funky.  Places matters
could go awry:


I solved it like this:

:1,/received/d
:$?^\s*For subscribe options?,$d
:let @a=''
:g/hole\|relativistic\|LISA\|black\|supermassive\|intermediate/?^\s*astro-ph?,/^\s*astro-ph/-y A
:%d
:put a
:1d
:%s!^\s*astro-ph/\(\d\+\)!<a href="http://xxx.lanl.gov/pdf/astro-ph/\1">&</a>
:w! /tmp/2.html
:q

and using the

cat /tmp/1.html | vim -s foo_arXiv.vim -



Well, we all have our peculiar areas of specialty.  When it
comes to things like plugins, syntax-highlighting nuances,
i18n/encoding stuff, and indentation, I leave that to the
folks on the list that are far more well-versed than myself.
  I'm more of an "ex" & regexp sorta guy, doing some crazy
stuff there, but otherwise, just using pretty stock
unadorned vim.


I guess I am one of the syntax-highlighting guys...

Now, as you can see, I am trying to generate an html out of the result
... the idea is to
create not a web page but an htmlized file to include it later into a
"real" web page...

The problem is that I have to tell vim to convert the end of line and
empty spaces into HTML markup, and I do it like this:

:1,/received/d
:$?^\s*For subscribe options?,$d
:let @a=''
:g/hole\|relativistic\|LISA\|black\|supermassive\|intermediate/?^\s*astro-ph?,/^\s*astro-ph/-y A
:%d
:put a
:1d
:%s!^\s*astro-ph/\(\d\+\)!<a href="http://xxx.lanl.gov/pdf/astro-ph/\1">&</a>
:%s!/^.//
:%s!/^$//
:w! /tmp/2.html
:q

But it doesn't work...

In case you are interested in what I am doing, try this:


#!/usr/bin/env zsh

links -dump http://lanl.arxiv.org/list/astro-ph/new |
  sed 's/\[abs, ps, pdf, other\]//' |
  sed 's/Title:/<b>Title<\/b>/g' |
  sed 's/Authors:/<b>Authors<\/b>/g' > /tmp/1.html &&

cat /tmp/1.html | vim -s ~pau/bin/foo_arXiv.vim -


where foo_arXiv.vim is posted above

And, btw, if you also tell me how to make vim embed the resulting file
into a determined position of the real web page (let's call it
real_web.html) , I will be more than grateful!!

Pau
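
The embedding question above can also be handled outside Vim. The following Python sketch assumes the target page contains a placeholder comment at the desired position; the marker text `<!-- arxiv-listing -->` and the function name are hypothetical, not something used in this thread.

```python
def embed_fragment(page, fragment, marker="<!-- arxiv-listing -->"):
    """Insert `fragment` right after the first occurrence of `marker` in `page`."""
    position = page.find(marker)
    if position == -1:
        # Refuse to guess a position when the placeholder is missing.
        raise ValueError("marker not found in page")
    cut = position + len(marker)
    return page[:cut] + "\n" + fragment + page[cut:]

# Hypothetical stand-ins for real_web.html and /tmp/2.html.
page = "<body>\n<!-- arxiv-listing -->\n</body>"
result = embed_fragment(page, "<p>listing</p>")
```

In practice the two strings would be read from real_web.html and /tmp/2.html and the result written back; the marker is left in place so the page can still be located on the next run, although repeated runs would then stack fragments unless the old one is removed first.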











Re: modify a text file

2006-04-29 Thread Tim Chase

cat file | vim -


What does the "-" stand for?


It is a standard *nix convention of accepting stdin as the 
source for the file (in this case, the output of cat).  That 
way, we never actually bung with the original file.  If you 
don't care if it gets hosed in the process, you can just do


vim file.txt


Then, clean up the stuff we don't want

1,/received/d
$?^\s*For subscribe options?,$d

to strip off the header and footer.


this worked out nicely


Glad to hear...


My first-pass solution will end up with duplicate results if
  more than one of your keywords appear in the same "block"
but on diff. lines:

:let @a=''
:g/red\|relativistic/?^\s*astro-ph?,/^\s*astro-ph/-y A
:%d
:put a
:1d
:wq name_of_output.txt


this is the content of foo.vim? On separated lines? I also tried with
CTRL + V + ENTER at the end of each line, but the result was the
same...


foo.vim consists of all 8 lines...first the two that "worked
out nicely" to strip the header and footer, followed by the
6 lines in question here.  In the foo.vim file, you'd omit
the leading colons (I don't think there's any harm in having
them in there, but their absence makes it easier to read),
and there's no need for extra ^M entries in the file with
^V, as the newlines are already in the file.



cat input.txt | vim -s foo.vim -


when I run it I end up with an empty file... ?


Well, you should be able to trace through it by entering 
each of the commands individually instead of sourcing the 
file to see where things are going funky.  Places matters 
could go awry:


Prob: the :g command isn't finding anything
Check: do ":echo @a" after the ":g" command to ensure that 
you've got some data


Prob:  @a has data, but it's not getting pasted
Check: that you're doing the ":put"

Prob:  No file named "name_of_output.txt" is being created
Check: that you have write permissions in the intended target
Check: that the file "name_of_output.txt" doesn't already exist
Check: that you're actually looking at the contents of the 
output file (name_of_output.txt) rather than the input file


Just a few more ideas...


I was saying in my deleted email that in any case it's a nice
lesson... When I think that people here think I know vim... what a
shame!


Well, we all have our peculiar areas of specialty.  When it 
comes to things like plugins, syntax-highlighting nuances, 
i18n/encoding stuff, and indentation, I leave that to the 
folks on the list that are far more well-versed than myself. 
 I'm more of an "ex" & regexp sorta guy, doing some crazy 
stuff there, but otherwise, just using pretty stock 
unadorned vim.


-tim






Re: modify a text file

2006-04-29 Thread Vim Visual

Hi Tim,

somehow my email was partially deleted... ??



cat file | vim -


What does the "-" stand for?



Then, clean up the stuff we don't want

1,/received/d
$?^\s*For subscribe options?,$d

to strip off the header and footer.


this worked out nicely



My first-pass solution will end up with duplicate results if
  more than one of your keywords appear in the same "block"
but on diff. lines:

:let @a=''
:g/red\|relativistic/?^\s*astro-ph?,/^\s*astro-ph/-y A
:%d
:put a
:1d
:wq name_of_output.txt


this is the content of foo.vim? On separated lines? I also tried with
CTRL + V + ENTER at the end of each line, but the result was the
same...







cat input.txt | vim -s foo.vim -


when I run it I end up with an empty file... ?





I'm sorry I couldn't come up with a clean way to snag just
the unique paragraphs easily without having an instance show
up as its own result-block.


I was saying in my deleted email that in any case it's a nice
lesson... When I think that people here think I know vim... what a
shame!



Anyways, it's at least one sorta-solution to what you describe.

-tim







Re: modify a text file

2006-04-28 Thread Tim Chase

I am struggling with sed and gawk but I guess that it'd be possible to
employ vim in the command line (it's to make a script that will be
automatically launched every 24 hours) but I don't have any idea of
how to do it...

How could I select the blocks (see file ahead) of a text file (say
.txt) in which some particular words appear?
Imagine that I want to keep the blocks containing words like "black",
"supermassive", "red", "intermediate", "relativistic"...
 and delete the rest of blocks (and also the header and bottom of the file)


Well, my first thought would be to have a destroyable copy 
of the text:


   cat file | vim -

Then, clean up the stuff we don't want

   1,/received/d
   $?^\s*For subscribe options?,$d

to strip off the header and footer.

My first-pass solution will end up with duplicate results if 
 more than one of your keywords appear in the same "block" 
but on diff. lines:


   :let @a=''
   :g/red\|relativistic/?^\s*astro-ph?,/^\s*astro-ph/-y A
   :%d
   :put a
   :1d
   :wq name_of_output.txt


You can alter that 2nd line for whatever keywords you want:

   red\|relativistic\|black\|supermassive\|intermediate

If case doesn't matter, you can tack "\c" onto your search 
pattern to ignore case:


   red\|black\|supermassive\c

I don't know how it behaves with branching, so you might 
have to wrap the whole thing in parens first to make them 
all case-insensitive (maybe not):


   \(red\|black\|supermassive\)\c
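
As a cross-check outside Vim, here is a minimal Python sketch of the same selection. It assumes, as the :g range above does, that each block starts at a line beginning with "astro-ph". The function name and the sample listing are invented for illustration. Because each block is tested once as a whole, a block with several keyword hits is kept only once.

```python
import re

def select_blocks(text, keywords, ignore_case=False):
    """Keep each block (starting at an 'astro-ph' line) that mentions a keyword."""
    flags = re.IGNORECASE if ignore_case else 0
    keyword_re = re.compile("|".join(re.escape(k) for k in keywords), flags)
    blocks, current = [], None
    for line in text.splitlines():
        if re.match(r"\s*astro-ph", line):
            # A new block begins; lines before the first one are dropped.
            current = []
            blocks.append(current)
        if current is not None:
            current.append(line)
    return ["\n".join(b) for b in blocks if keyword_re.search("\n".join(b))]

# Hypothetical listing with three blocks.
sample = """header line
astro-ph/0604565
Title: A supermassive black hole
Both keywords: relativistic jets
astro-ph/0604566
Title: Dust in galaxies
astro-ph/0604567
Title: RELATIVISTIC winds
"""

case_sensitive = select_blocks(sample, ["black", "relativistic"])
case_insensitive = select_blocks(sample, ["black", "relativistic"], ignore_case=True)
```

The `ignore_case` flag plays the role of "\c" here: it applies to every alternative in the keyword list at once.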

If you want to highlight your hits as well, you can tweak it 
like


:g/red\|relativistic/s!!<b>&</b>!g|?^\s*astro-ph...

which, given that you seem to want to HTMLize your results 
(as hinted at below), will bold each hit.



What would be the command line with vim? (or are there other possibilities?)


While you could hack all that into a command line, it might 
be easier to put those lines in a script, say "foo.vim", and 
then just source that script on the command line:


   cat input.txt | vim -s foo.vim -


I would also like to know how to replace the

astro-ph/0604565 with http://xxx.lanl.gov/pdf/astro-ph/0604565

for all numbers, not only for 0604565 ...


after the ":1d" (that's "one dee", not "ell dee") line, you 
could put something like


:%s!^\s*astro-ph/\(\d\+\)!<a href="http://xxx.lanl.gov/pdf/astro-ph/\1">&</a>


(all on one line in case my mailer bungs it).  Your HTML was 
a little funky there, so I made some assumptions and cleaned 
it up a little:  The "\1" in the replacement is the number, 
and the "&" in the replacement is the whole original text
(the "astro-ph/###" bit), so you'll have an HTML link
with the original text as the clickable bit.
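
The same substitution can be sketched in Python to see what the "\1" and "&" parts do; this is an illustration under the assumption that each identifier sits at the start of a line, and the function name is made up. Python's `\g<0>` plays the role of Vim's "&".

```python
import re

def linkify(text):
    """Wrap each line-leading 'astro-ph/NNNN' identifier in a link to its PDF."""
    # [ \t]* rather than \s*: in Python \s also matches newlines, which would
    # let a match swallow preceding blank lines. Vim's \s is space or tab only.
    pattern = re.compile(r"^[ \t]*astro-ph/(\d+)", re.MULTILINE)
    # \1 is the number, \g<0> is the whole matched text.
    replacement = r'<a href="http://xxx.lanl.gov/pdf/astro-ph/\1">\g<0></a>'
    return pattern.sub(replacement, text)

linked = linkify("astro-ph/0604565 (replaces v1)")
```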


I'm sorry I couldn't come up with a clean way to snag just 
the unique paragraphs easily without having an instance show 
up as its own result-block.


Anyways, it's at least one sorta-solution to what you describe.

-tim