XPath looks pretty sweet. Has anyone used it for similar data sizes?

On 06/01/06, ryanm <[EMAIL PROTECTED]> wrote:
> > I find myself in a situation where I need to build a tool to analyse
> > lots of XML data. Thousands of records containing a lot of strings as
> > well as numeric values.
> >
>     When I found myself in this situation I did 2 things:
>
> 1. Don't use XML; it is way too heavy for this much data. I found that by
> using a double-delimited or fixed-width data format, the file size was
> reduced by as much as 70%. In the end, I went with fixed width because I
> could parse it faster (by avoiding calling split() thousands of times).
>
>     Now, I still used the XML object, but instead of letting it parse the
> file, I overrode the onData handler and used my own parsing function, which
> generated objects directly instead of parsing it out to an XML object.
> Essentially, the XML object just read the data in and dumped it to my
> parsing function.
>
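For anyone following along, here is a minimal sketch of the fixed-width idea in point 1: each field is cut out by offset with substring() instead of calling split(). It is written in TypeScript rather than ActionScript, and the field names, widths, and the parseFixedWidth function are invented for illustration, since the actual record layout isn't given above.

```typescript
// Hypothetical record layout: field names and widths are assumptions.
interface FieldSpec {
  name: string;
  start: number; // offset of the field inside one record
  width: number; // number of characters the field occupies
}

const RECORD_WIDTH = 40;
const FIELDS: FieldSpec[] = [
  { name: "id", start: 0, width: 6 },
  { name: "name", start: 6, width: 24 },
  { name: "city", start: 30, width: 10 },
];

// Walk the raw string one fixed-width record at a time and build plain
// objects directly, without split() and without an intermediate XML tree.
function parseFixedWidth(raw: string): Record<string, string>[] {
  const records: Record<string, string>[] = [];
  for (let offset = 0; offset + RECORD_WIDTH <= raw.length; offset += RECORD_WIDTH) {
    const record: Record<string, string> = {};
    for (const field of FIELDS) {
      const begin = offset + field.start;
      // Fields are space-padded to their width, so trim the padding.
      record[field.name] = raw.substring(begin, begin + field.width).trim();
    }
    records.push(record);
  }
  return records;
}
```

In Flash the raw string would arrive through the overridden onData handler; here it is just a function argument.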
> 2. Don't try to parse it all at once. What I did was dump it all into a
> buffer when it was loaded, and then fire off a parsing function that parsed
> 250 records per frame. I found that number through trial and error; you can
> find your own balance. The important thing was, the application didn't stop
> functioning while the records were being parsed, you could go to other areas
> of the app and use it normally, and when you went to the section that
> required the data, you got a progress bar showing how many records had been
> parsed.
>
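The batching in point 2 could look roughly like the sketch below, again in TypeScript rather than ActionScript. BatchParser and runInBatches are hypothetical names, and the scheduler is passed in as a parameter so it can stand in for a per-frame callback. This version keeps a position into the buffer instead of slicing records off the top, which is a substitution on my part, not the method described in the quoted message.

```typescript
// Parses a pre-split buffer of raw records a fixed number at a time.
class BatchParser<T> {
  private position = 0;
  readonly results: T[] = [];

  constructor(
    private readonly buffer: string[],
    private readonly parseRecord: (raw: string) => T,
    private readonly batchSize: number = 250 // tune by trial and error
  ) {}

  // Parse the next batch. Returns true while records remain.
  step(): boolean {
    const end = Math.min(this.position + this.batchSize, this.buffer.length);
    for (; this.position < end; this.position++) {
      this.results.push(this.parseRecord(this.buffer[this.position]));
    }
    return this.position < this.buffer.length;
  }

  // Fraction of records parsed so far (0..1), e.g. for a progress bar.
  progress(): number {
    return this.buffer.length === 0 ? 1 : this.position / this.buffer.length;
  }
}

// Runs one batch per scheduled tick so the rest of the app stays responsive.
// `schedule` is whatever defers work to the next frame in your environment.
function runInBatches<T>(
  parser: BatchParser<T>,
  schedule: (next: () => void) => void,
  onProgress: (fraction: number) => void,
  onDone: (results: T[]) => void
): void {
  const tick = (): void => {
    const more = parser.step();
    onProgress(parser.progress());
    if (more) {
      schedule(tick);
    } else {
      onDone(parser.results);
    }
  };
  tick();
}
```

In a browser the scheduler might be `(next) => setTimeout(next, 0)`; in Flash the equivalent would be calling step() from an onEnterFrame handler.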
>     My parsing function was semi-complicated. It took the whole dataset in
> as a string and split it on my record delimiter, and this array became my
> buffer. This way I knew how many records there were to parse, and
> approximately how long it would take to parse them. It then sliced 250
> records off the top of the buffer on every frame and passed them to the
> serialization function, which took them, serialized them, and inserted them
> into my "database" object. My parsing function also built several indexes
> while it was parsing the records, to make lookups faster once the database
> was ready. My application was a database of hotels, which were sortable by a
> number of criteria, so the parsing routine looked for those attributes of
> each hotel as it parsed, and when it saw a new value for one of those
> criteria, it made a new entry in the appropriate index for it.
>
>     I made very heavy use of the object collection syntax, for example:
>
> Index["Location"]["USA"]["Texas"]["Dallas"]
>
>     ...referred to an array of hotel ids which were in Dallas, Texas, USA,
> which could be used to find a hotel like this:
>
> // 0 is the first index in the array of ids
> hotelID = Index["Location"]["USA"]["Texas"]["Dallas"][0];
> return(Database[hotelID]);
>
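A rough sketch of how a nested index like that might be built while records are inserted, in TypeScript rather than ActionScript. The Hotel fields and the insertHotel and hotelsIn helpers are assumptions for illustration; only the Index["Location"][country][state][city] shape comes from the quoted message.

```typescript
// Hypothetical record shape; the real hotel fields are not given.
interface Hotel {
  id: number;
  name: string;
  country: string;
  state: string;
  city: string;
}

// country -> state -> city -> array of hotel ids
type LocationIndex = {
  [country: string]: { [state: string]: { [city: string]: number[] } };
};

const database: { [id: number]: Hotel } = {};
const locationIndex: LocationIndex = {};

// Store the hotel and, whenever a new country, state, or city value is
// seen, create the matching index entry on the fly.
function insertHotel(hotel: Hotel): void {
  database[hotel.id] = hotel;
  const states = locationIndex[hotel.country] || (locationIndex[hotel.country] = {});
  const cities = states[hotel.state] || (states[hotel.state] = {});
  const ids = cities[hotel.city] || (cities[hotel.city] = []);
  ids.push(hotel.id);
}

// Look up every hotel in a city; returns an empty array for unknown places.
function hotelsIn(country: string, state: string, city: string): Hotel[] {
  const states = locationIndex[country];
  const cities = states ? states[state] : undefined;
  const ids = cities ? cities[city] : undefined;
  return (ids || []).map((id) => database[id]);
}
```

Building the index during insertion means a lookup such as hotelsIn("USA", "Texas", "Dallas") only walks three object keys instead of scanning every record.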
>     In the end, it took about 5 times as much code to import, parse, and
> index the database as the whole rest of the application, but it worked, it
> was relatively fast, and it met the requirements I was given. I would've
> preferred for it to work from a web server, selecting what I needed from the
> database, but the client required that it work offline from a database that
> shipped with the cd, as well as be able to download an updated database from
> their website, and this was the best solution I could find in Flash that
> worked on both PC and Mac (no 3rd party wrappers). Unfortunately it had to
> parse the whole database every time you ran the app, but it would get the
> newest version from the web if you were online and it gave you the option to
> store it (in an ungodly-sized shared object) if you wanted to.
>
>     Anyway, that's how I did it; whether or not it was successful is a
> matter of opinion. ;-)
>
> ryanm
>
> _______________________________________________
> Flashcoders mailing list
> Flashcoders@chattyfig.figleaf.com
> http://chattyfig.figleaf.com/mailman/listinfo/flashcoders
>


--
Jonathan Clarke
1976 Ltd
http://19seventysix.co.uk
e: [EMAIL PROTECTED]
m (UK): +44 773 646 1954
m (Barbados): +1246 259 9475
