You may also try the api/expat addon. It is not necessarily faster,
because processing boxed data in J is inherently slow. You may have to
amend those addons to replace the boxes with something else that is
specific to your application.
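
For comparison, here is a minimal sketch of the same callback-style
job done directly against expat. It is in Python (the language used for
comparison in the quoted message), through the standard xml.parsers.expat
module; it does not show the interface of the J api/expat addon. The
function name collect_node_ids, the chunk size, and the sample document
are made up for illustration; only the Node element and the _Id
attribute come from the quoted code.

```python
import io
import xml.parsers.expat


def collect_node_ids(stream, chunk_size=1 << 20):
    """Return the _Id attribute of every Node element read from a binary stream."""
    ids = []  # plain flat list of strings

    def start_element(name, attrs):
        # expat passes attributes as a dict of name -> value
        if name == "Node" and "_Id" in attrs:
            ids.append(attrs["_Id"])

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element

    # Feed the input in fixed-size chunks so the whole file is never held in memory.
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.Parse(chunk, False)
    parser.Parse(b"", True)  # signal end of input
    return ids


# Hypothetical sample document, for illustration only.
sample = b'<Root><Node _Id="a1"><Node _Id="b2"/></Node><Other _Id="x"/></Root>'
print(collect_node_ids(io.BytesIO(sample)))
```

For a real file, open it in binary mode and pass the file object in
place of the BytesIO wrapper. The handler only appends to a flat list,
so the per-element work stays small.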

Fri, 09 Aug 2013, Dan Farmer wrote:
> Hi again,
> 
> So I am trying to parse some large (2-9 GB) XML files for an idea I
> had for using JDB. My plan was to use XSLT to flatten these things out
> (they are deeply nested structures), but figured I'd do a quick and
> easy test to make sure I had a reasonable grip on J's facilities
> before diving in.
> 
> Unfortunately, the code I came up with is so slow that I don't think
> it's even worth attempting. Can anyone provide some tips on how I
> might speed this up? I read up on the J performance monitor and
> profiled the run: it reported that 75% of the time was spent in
> cdcallback (which makes me think there's nothing I can do short of
> writing something in C/C++, but maybe I'm wrong). Here's the code
> (loosely adapted from Oleg & John Baker's examples for sax).
> 
> For the record, I created two smaller test files (500 KB and 6 MB) and
> the code below works correctly on both of those. I've also written
> Python code using lxml's element tree module, and it can process the
> 2 GB file in about 60 seconds. I let the J code below run for 30
> minutes and then killed it.
> 
> Any ideas?
> 
> Thanks,
> Dan
> 
> require 'jmf'
> require 'files dir'
> require 'xml/sax'
> 
> saxclass 'xp'
> 
> startDocument=: 3 : 0
> ids=: ''
> )
> 
> 
> startElement=: 4 : 0
> if. y-:,'Node' do.
>   ids=: ids,< x getAttribute '_Id'
> end.
> )
> 
> 
> endDocument=: 3 : 0
> s: ids
> )
> 
> NB. =========================================================
> cocurrent 'base'
> 
> fn=: 'c:/data/test/2GBfile.xml'
> 
> unmap_jmf_ 'xfile' NB. Hokey, but for debugging
> 
> JCHAR map_jmf_ 'xfile';fn
> 
> process_xp_ xfile
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm

-- 
regards,
