Hi Keith,
you wrote:
>hi elan
>well i've looked at the parse section and can't quite figure it out...
>how do you nominate that you want the parse function to take place on a
>text file say "aeros.txt" rather than a string?
parse read %aeros.txt some-parse-rule
>and how do you specify it's location?
%aeros.txt ;- file aeros.txt is located in current directory
%/c/windows/temp/aeros.txt
;- aeros.txt is located on drive C: in directory
;- \Windows\Temp\
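If you are not sure REBOL can see the file, a quick check along these lines may help. This is only a sketch: the path is the example location from above, not necessarily where your file really lives.

```rebol
REBOL [
    title: "File location check (sketch)"
]

; assumed example path, replace with the real location of your file
file: %/c/windows/temp/aeros.txt

; what-dir shows the directory that relative names like %aeros.txt refer to
print ["current directory:" what-dir]

either exists? file [
    ; read returns the whole file as one string
    print ["characters read:" length? read file]
][
    print ["cannot find" file]
]
```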
>and how would you strip off just what's after the colon and discard what is
>before the colon?
It depends on which version of REBOL you are using. I'll go with REBOL/Core
here.
I am assuming that newline is your delimiter for a record. The
CMLISTPTRACKS: entry is wrapped over several lines in my email client. But
for this item I also assume that the original data consists of one long line.
The following script creates a parse rule, reads the aero.txt file and
applies the rule to the file.
It generates the following output in the REBOL console: (This output is
triple wrapped: by the REBOL console, by my email client, by your email
client. The actual string contains no newlines.)
>> do %parse-question.r
Script: "Untitled" (none)
INFINITE POSSIBILITIES AMEL LARRIEUX 4948792 9399700067507 C9921
21 SMA 19/06/20
00 GET UP,I N I,SWEET MISERY,SEARCHIN' FOR MY SOUL,EVEN IF,INFINITE
POSSIBILITES,SHINE,DOWN,WEA
THER,MAKE ME WHOLE,GET UP (THREAD HAD FUN MAIN MI,GET UP (MV MIG MIX
(RADIO))15.91 Compact
Disc
Here's the script:
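REBOL [
    file: %parse-question.r
]

; the tab-delimited output is collected here
result: {}

; the field labels to keep; the text after each label, up to the
; end of the line, is copied into result
words: [
    "CPTITLE:"
    "CPARTIST:"
    "CMCATNUMBER:"
    "CMAPN:"
    "CMARIAPRICECODE:"
    "CMARIAMEDIACODE:"
    "CMARIADISTRIBUTORCODE:"
    "DMRELEASE:"
    "CMLISTPTRACKS:"
    "CURMPRICE:"
    "CMTYPE:"
]

; build one alternative per label:
;   "LABEL:" copy info to newline (append result join info tab) |
parse-sub-rule: []
foreach word words [
    insert tail parse-sub-rule :word
    insert tail parse-sub-rule [
        copy info to newline (append result join info tab) |
    ]
]

; last alternative: skip one character when no label matches
insert tail parse-sub-rule [
    skip
]

parse-rule: compose/deep [some [(parse-sub-rule)]]
parse-file: read %aero.txt
parse parse-file parse-rule
print result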
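For the header row and the Excel import you mentioned in your first message, something like the following might do. It is only a sketch under a few assumptions: the first four labels are the ones you suggested, the remaining labels are my guesses from the field names, %aeros-out.txt is a made-up output file name, and sample-result stands in for the result string built by the script above.

```rebol
REBOL [
    title: "Header row (sketch)"
]

; column labels: the first four come from the original question,
; the rest are guessed from the field names (assumption)
labels: [
    "Title" "Artist" "Cat no" "APN" "Price code" "Media code"
    "Distributor" "Release" "Tracks" "Price" "Type"
]

; build the header the same way result is built: label, tab, label, tab ...
header: copy {}
foreach label labels [
    append header label
    append header tab
]

; stand-in for the result string produced by the parse rule
sample-result: {INFINITE POSSIBILITIES^-AMEL LARRIEUX^-4948792^-}

; hypothetical output file: one header line, then one line per record
write %aeros-out.txt rejoin [header newline sample-result newline]
```

Excel should then be able to open the output as a tab-delimited text file.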
>when you parse files, can it be done on a folder of files?
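; assumes the folder holds only text files of the same format
dir: %/c/folder/of/files/
foreach file read dir [
    clear result
    ; read returns bare file names, so prepend the directory
    parse read join dir file parse-rule
    print result
]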
>i have a stack of html files that i need to extract all text in a
>particular section and then change the table cells to tabs etc but i have a
>few to do and would like to run the script over the whole lot at once if
>possible.
This sounds like a different task?
>i can see no reference to how you parse files locally..there are only
>examples of web addresses or specifying a string.
See the examples above.
>
>thanks for your help
>keith
>
>> >hi all
>> >i am new to scripting (rebol and python) and was wondering if it was
>> >capable of doing to the following and if so..how?
>> >the following is an example of what i need to strip and turn into a tab
>> >delimited file..
>> >the entries are from a music database that is updated weekly..it is in
>> >lotus notes so i export from there and it dumps this file out with the new
>> >titles for the coming week (here the 19th June)
>> >i only need to get the rows with the double astericks at the start (i put
>> >those in for this post, they are not there in the normal file) and then
>> >strip the words at the start and then put a tab in between them so i can
>> >bring it into excel for the sales team to look at i would need a first
>> >row to describe each column also maybe something like
>> >Title tab Artist tab Cat no tab APN etc etc
>> >the tracklisting is a little more complicated as it has multiple tracks
>> >within the row..they are seperated by commas
>> >i think that it won't be that hard but i had no success in perl so i
>> >thought i might try rebol or python...
>> >
>> >thanks in advance
>> >keith
>> >
>> >FORM: Popular Recording
>> >CMTYPESWITCH: Popular
>> >RELOCORIGINAL: 55864A2603AC94ACCA2568F200224388
>> >CMCOUNTRY_ORIGIN:
>> >CMGENRE_CODE:
>> >**CPTITLE: INFINITE POSSIBILITIES
>> >**CPARTIST: AMEL LARRIEUX
>> >CPOARTISTS:
>> >**CMCATNUMBER: 4948792
>> >**CMAPN: 9399700067507
>> >**CMARIAPRICECODE: C9921
>> >**CMARIAMEDIACODE: 21
>> >**CMARIADISTRIBUTORCODE: SMA
>> >CMARIAPACKAGECODE:
>> >**DMRELEASE: 19/06/2000
>> >cMDateFormat: 19/06/2000 12:00:00 AM
>> >DMDELETE:
>> >cMDeleteStatus: No
>> >**CMLISTPTRACKS: GET UP,I N I,SWEET MISERY,SEARCHIN' FOR MY SOUL,EVEN
>> >IF,INFINITE POSSIBILITES,SHINE,DOWN,WEATHER,MAKE ME WHOLE,GET UP (THREAD
>> >HAD FUN MAIN MI,GET UP (MV MIG MIX (RADIO))
>> >CMLISTPARTISTS: ,,,
>> >RELOCMEDIA: F73C80183A01A8D7CA2568F2002A5517
>> >**CURMPRICE: 15.91
>> >**CMTYPE: Compact Disc
>> >CMRECORDCOMPANY: SONY MUSIC
>> >CMPACKAGE:
>> >CMARIADISTRIBUTORHOUSECODE:
>> >cDistributorHouse:
>> >CMLOCALE: Y
>> >$UpdatedBy: CN=PPT/OU=AEROS/O=JUKEBOX
>> >$Revisions: 02/06/2000 05:42:22 PM,02/06/2000 05:42:22 PM
>> >
>> >
>> >
>>
>>;- Elan [ : - ) ]
>
>
>
;- Elan [ : - ) ]