On Fri, Oct 23, 2009 at 6:39 AM, Gonen Shoham <gone...@sapiens.com> wrote:
> I am not a PIPE expert....
>
> The criteria is actually delete lines where word(1) = 'XX' and word(3) =
> 'YY' and substring(25,1) = 'a'   etc....

We probably should have such discussions on CMSPIP-L instead...

The "etc" makes cheating very hard ;-)   The generic solution really
is multi-stream pipelines. The idea there is that your pipeline splits
the records in two groups based on the first selection, and subsequent
stages split the matched records further in two groups, etc. But you
need to collect the unmatched records in each phase and pass them all
back to the main path (since you did not want to delete them). So you
get a network of pipelines that connect at the beginning and end of
the process. Have a look at Melinda's first 2 papers on the CMS
Pipelines Homepage.

  < input file a
  | a1: pick w1 ^== ,XX,
  | y: faninany
  | > output file a
  \ a1:
  | a2: pick w3 ^== ,YY,
  | y:
  \ a2:
  | a3: pick 25.1 ^== ,a,
  | y:

I have reversed the condition (select the records that do NOT match)
because it keeps the pipeline a bit more straight. And because you
only had *and* in your selection, the pattern is rather obvious.
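
If the reversed logic looks suspicious, it can be checked outside of
CMS. What follows is only a rough Python model of the cascade above, not
Pipelines code: the names word() and cascade() are made up for the
illustration, and it assumes blank-delimited words and that "25.1" means
the single character in column 25. A record goes to the output as soon
as one test finds a mismatch; only a record that matches all three
tests falls off the end and is dropped.

```python
def word(record, n):
    """Return the n-th blank-delimited word (1-based), or '' if absent."""
    words = record.split()
    return words[n - 1] if len(words) >= n else ""


def cascade(records):
    """Model of the three chained 'pick ... ^==' stages feeding faninany."""
    # Each test mirrors one pick stage: True means "does NOT match",
    # so the record leaves on the primary output and is kept.
    tests = [
        lambda r: word(r, 1) != "XX",   # a1: pick w1 ^== ,XX,
        lambda r: word(r, 3) != "YY",   # a2: pick w3 ^== ,YY,
        lambda r: r[24:25] != "a",      # a3: pick 25.1 ^== ,a,
    ]
    kept = []
    for record in records:
        # any() stops at the first test that passes, just as a record
        # stops travelling once a pick stage sends it to faninany.
        if any(test(record) for test in tests):
            kept.append(record)
    return kept


# Sample records (invented for the illustration); column 25 is the last
# character of each one.
doomed = "XX foo YY".ljust(24) + "a"    # matches all three tests
wrong_col = "XX foo YY".ljust(24) + "b"
wrong_w3 = "XX foo ZZ".ljust(24) + "a"
wrong_w1 = "AA foo YY".ljust(24) + "a"

result = cascade([doomed, wrong_col, wrong_w3, wrong_w1])
```

With these sample records only the first one is dropped; the other
three come out in their original order.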

Sir Rob the Plumber

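PS Very lazy plumbers might (when it is just a one-time effort) be
tempted to simply repeat the same process a few times. Run the pipeline
once to skip the XX records, another time to skip the YY, etc. And
accept that you read and write the file a few times. But mind that
separate passes drop a record as soon as *any* one test matches, so
that shortcut only fits when the conditions are meant as *or*. With
*and* you need the multi-stream pipeline above.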
