Hi Keith,

you wrote:
>hi elan
>well i've looked at the parse section and can't quite figure it out...
>how do you nominate that you want the parse function to take place on a 
>text file say "aeros.txt" rather than a string?

parse read %aeros.txt some-parse-rule
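
In other words, read hands parse the file's contents as a string, so parsing a file is no different from parsing a string. Here is a small sketch of that idea. The file name %parse-demo.txt and the one-line rule are made up for this example and are not part of your data:

```rebol
REBOL []

; write a tiny test file so the example is self-contained
; (%parse-demo.txt is a hypothetical name; ^/ is a newline)
write %parse-demo.txt "CPTITLE: INFINITE POSSIBILITIES^/"

; read returns the contents of the file as a string
text: read %parse-demo.txt
print string? text

; parse is applied to that string exactly as it would be to a literal;
; this rule only checks that the keyword occurs somewhere in the text
print parse text [to "CPTITLE:" to end]
```

Both print lines should report true if the file was written and read back as expected.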

>and how do you specify it's location?

%aeros.txt ;- file aeros.txt is located in current directory
%/c/windows/temp/aeros.txt 
     ;- aeros.txt is located on drive C: in directory 
     ;- \Windows\Temp\
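
If you are not sure whether a path is written correctly, you can ask REBOL before parsing anything. This is only a sketch; the two paths are the ones from above, and whether they exist on your machine is an assumption:

```rebol
REBOL []

; REBOL file paths use forward slashes on every platform
print what-dir                            ; the current directory
print exists? %aeros.txt                  ; relative to the current directory
print exists? %/c/windows/temp/aeros.txt  ; absolute path on drive C:

; change-dir makes another directory the current one, after which the
; short relative form %aeros.txt refers to the file in that directory:
; change-dir %/c/windows/temp/
```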

>and how would you strip off just what's after the colon and discard what is 
>before the colon?

It depends on which version of REBOL you are using. I'll go with REBOL/Core
here. 

I am assuming that newline is your delimiter for a record. The
CMLISTPTRACKS: entry is wrapped over several lines in my email client, but
I assume that in the original data it is one long line.

The following script creates a parse rule, reads the aero.txt file and
applies the rule to the file. 

It generates the following output in the REBOL console. (This output is
triple-wrapped: by the REBOL console, by my email client, and by your email
client. The actual string contains no newlines.)

>> do %parse-question.r
Script: "Untitled" (none)
 INFINITE POSSIBILITIES  AMEL LARRIEUX   4948792     9399700067507   C9921
 21  SMA     19/06/20
00   GET UP,I N I,SWEET MISERY,SEARCHIN' FOR MY SOUL,EVEN IF,INFINITE
POSSIBILITES,SHINE,DOWN,WEA
THER,MAKE ME WHOLE,GET UP (THREAD HAD FUN MAIN MI,GET UP (MV MIG MIX
(RADIO))    15.91   Compact
Disc


Here's the script:

REBOL [
  file: %parse-question.r
]

result: {}

words: [
  "CPTITLE:"
  "CPARTIST:"
  "CMCATNUMBER:"
  "CMAPN:"
  "CMARIAPRICECODE:"
  "CMARIAMEDIACODE:"
  "CMARIADISTRIBUTORCODE:"
  "DMRELEASE:"
  "CMLISTPTRACKS:"
  "CURMPRICE:"
  "CMTYPE:"
]

parse-sub-rule: []
foreach word words [
  insert tail parse-sub-rule :word
  insert tail parse-sub-rule [
    copy info to newline (append result join info tab) |
  ]
]
insert tail parse-sub-rule [
  skip
]  

parse-rule: compose/deep [some [(parse-sub-rule)]]

parse-file: read %aero.txt

parse parse-file parse-rule

print result
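
To make the generated rule easier to follow, here is roughly what it amounts to when written out by hand for just two of the keywords. This is a sketch for illustration and not part of the script above; the sample string is invented from the data in your post:

```rebol
REBOL []

result: {}

; Hand-written equivalent of the generated rule, limited to two keywords.
; Each alternative matches a keyword, copies the rest of the line into
; info, and appends it to result followed by a tab. The last alternative,
; skip, moves ahead one character when no keyword matches at the
; current position.
parse-rule: [
  some [
    "CPTITLE:"  copy info to newline (append result join info tab) |
    "CPARTIST:" copy info to newline (append result join info tab) |
    skip
  ]
]

; sample input made up for this example, modeled on the posted record
sample: {FORM: Popular Recording
CPTITLE: INFINITE POSSIBILITIES
CPARTIST: AMEL LARRIEUX
CPOARTISTS:
}

parse sample parse-rule
probe result
```

The loop in the script builds the same kind of block, one alternative per entry in words, so adding a field only means adding its keyword to that list.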

>when you parse files, can it be done on a folder of files?

; assumes the folder contains only the files you want to parse
dir: %/c/folder/of/files/
foreach file read dir [
  clear result
  parse-file: read join dir file
  parse parse-file parse-rule
  print result
]

>i have a stack of html files that i need to extract all text in a 
>particular section and then change the table cells to tabs etc but i have a 
>few to do and would like to run the script over the whole lot at once if 
>possible.

This sounds like a different task?

>i can see no reference to how you parse files locally..there are only 
>examples of web addresses or specifying a string.

See the examples above.

>
>thanks for your help
>keith
>
>> >hi all
>> >i am new to scripting (rebol and python) and was wondering if it was
>> >capable of doing to the following and if so..how?
>> >the following is an example of what i need to strip and turn into a tab
>> >delimited file..
>> >the entries are from a music database that is updated weekly..it is in
>> >lotus notes so i export from there and it dumps this file out with the new
>> >titles for the coming week (here the 19th June)....
>> >i only need to get the rows with the double astericks at the start (i put
>> >those in for this post, they are not there in the normal file) and then
>> >strip the words at the start and then put a tab in between them so i can
>> >bring it into excel for the sales team to look at .... i would need a first
>> >row to describe each column also maybe something like
>> >Title tab Artist tab Cat no tab APN etc etc
>> >the tracklisting is a little more complicated as it has multiple tracks
>> >within the row..they are seperated by commas....
>> >i think that it won't be that hard but i had no success in perl so i
>> >thought i might try rebol or python...
>> >
>> >thanks in advance
>> >keith
>> >
>> >FORM: Popular Recording
>> >CMTYPESWITCH: Popular
>> >RELOCORIGINAL: 55864A2603AC94ACCA2568F200224388
>> >CMCOUNTRY_ORIGIN:
>> >CMGENRE_CODE:
>> >**CPTITLE: INFINITE POSSIBILITIES
>> >**CPARTIST: AMEL LARRIEUX
>> >CPOARTISTS:
>> >**CMCATNUMBER: 4948792
>> >**CMAPN: 9399700067507
>> >**CMARIAPRICECODE: C9921
>> >**CMARIAMEDIACODE: 21
>> >**CMARIADISTRIBUTORCODE: SMA
>> >CMARIAPACKAGECODE:
>> >**DMRELEASE: 19/06/2000
>> >cMDateFormat: 19/06/2000 12:00:00 AM
>> >DMDELETE:
>> >cMDeleteStatus: No
>> >**CMLISTPTRACKS: GET UP,I N I,SWEET MISERY,SEARCHIN' FOR MY SOUL,EVEN
>> >IF,INFINITE POSSIBILITES,SHINE,DOWN,WEATHER,MAKE ME WHOLE,GET UP (THREAD
>> >HAD FUN MAIN MI,GET UP (MV MIG MIX (RADIO))
>> >CMLISTPARTISTS: ,,,,,,,,,,,
>> >RELOCMEDIA: F73C80183A01A8D7CA2568F2002A5517
>> >**CURMPRICE: 15.91
>> >**CMTYPE: Compact Disc
>> >CMRECORDCOMPANY: SONY MUSIC
>> >CMPACKAGE:
>> >CMARIADISTRIBUTORHOUSECODE:
>> >cDistributorHouse:
>> >CMLOCALE: Y
>> >$UpdatedBy: CN=PPT/OU=AEROS/O=JUKEBOX
>> >$Revisions: 02/06/2000 05:42:22 PM,02/06/2000 05:42:22 PM
>> >
>> >
>> >
>>
>>;- Elan [ : - ) ]
>
>
>

;- Elan [ : - ) ]
