Re: searching an XML doc

2008-01-16 Thread grflanagan
On Jan 15, 9:33 pm, Gowri [EMAIL PROTECTED] wrote:
> Hello,
>
> I've been reading about ElementTree and ElementPath so I could use
> them to find the right elements in the DOM. Unfortunately neither of
> these seems to offer XPath-like capabilities where I can find elements
> based on tag, attribute values etc. Are there any libraries which can
> give me XPath-like functionality?
>
> Thanks in advance

Create your query like:

ns0 = '{http://a.b.com/phedex}'

query = '%srequest/%sstatus' % (ns0, ns0)

Also, although imperfect, some people have found this useful:

http://gflanagan.net/site/python/utils/elementfilter/elementfilter.py.txt

[CODE]

test = '''<phedexData xmlns="http://a.b.com/phedex"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://a.b.com/phedex requests.xsd">
<!--  Low priority replication request -->
<request id="1234" last_update="1060199000.0">
<status>
<approved>T1_RAL_MSS</approved>
<approved>T2_London_ICHEP</approved>
<disapproved>T2_Southgrid_Bristol</disapproved>
<pending/>
<move_pending/>
</status>
<subscription open="1" priority="0" type="replicate">
<items>
<dataset>/PrimaryDS1/ProcessedDS1/Tier</dataset>
<block>/PrimaryDS2/ProcessedDS2/Tier/block</block>
</items>
</subscription>
</request>
</phedexData>
'''

from xml.etree import ElementTree as ET

# Parse the sample document held in the string above
root = ET.fromstring(test)

# Namespace URI in ElementTree's {uri} notation
ns0 = '{http://a.b.com/phedex}'

from rattlebag.elementfilter import findall, data

# http://gflanagan.net/site/python/utils/elementfilter/elementfilter.py.txt

query0 = '%(ns)srequest/%(ns)sstatus' % {'ns': ns0}
query1 = '%(ns)srequest/%(ns)[EMAIL PROTECTED]replicate]/%(ns)sitems' % {'ns': ns0}
query2 = '%(ns)[EMAIL PROTECTED]/%(ns)sstatus/%(ns)sapproved' % {'ns': ns0}

print 'With ElementPath: '
print root.findall(query0)
print
print 'With ElementFilter:'
for query in [query0, query1, query2]:
    print
    print '+'*50
    print 'query: ', query
    print
    # Show each matching element and its serialized XML
    for item in findall(root, query):
        print 'item: ', item
        print 'xml:'
        ET.dump(item)

print '-'*50
print
print 'approved: ', data(root, query2)

[/CODE]

[OUTPUT]
With ElementPath:
[<Element {http://a.b.com/phedex}status at b95ad0>]

With ElementFilter:

++++++++++++++++++++++++++++++++++++++++++++++++++
query:  {http://a.b.com/phedex}request/{http://a.b.com/phedex}status

item:  <Element {http://a.b.com/phedex}status at b95ad0>
xml:
<ns0:status xmlns:ns0="http://a.b.com/phedex">
<ns0:approved>T1_RAL_MSS</ns0:approved>
<ns0:approved>T2_London_ICHEP</ns0:approved>
<ns0:disapproved>T2_Southgrid_Bristol</ns0:disapproved>
<ns0:pending />
<ns0:move_pending />
</ns0:status>


++++++++++++++++++++++++++++++++++++++++++++++++++
query:  {http://a.b.com/phedex}request/{http://a.b.com/[EMAIL PROTECTED]==replicate]/{http://a.b.com/phedex}items

item:  <Element {http://a.b.com/phedex}items at b95eb8>
xml:
<ns0:items xmlns:ns0="http://a.b.com/phedex">
<ns0:dataset>/PrimaryDS1/ProcessedDS1/Tier</ns0:dataset>
<ns0:block>/PrimaryDS2/ProcessedDS2/Tier/block</ns0:block>
</ns0:items>


++++++++++++++++++++++++++++++++++++++++++++++++++
query:  {http://a.b.com/[EMAIL PROTECTED]/{http://a.b.com/phedex}status/{http://a.b.com/phedex}approved

item:  <Element {http://a.b.com/phedex}approved at b95cd8>
xml:
<ns0:approved xmlns:ns0="http://a.b.com/phedex">T1_RAL_MSS</ns0:approved>

item:  <Element {http://a.b.com/phedex}approved at b95cb0>
xml:
<ns0:approved xmlns:ns0="http://a.b.com/phedex">T2_London_ICHEP</ns0:approved>

--------------------------------------------------

approved:  ['T1_RAL_MSS', 'T2_London_ICHEP']
INFO End logging.
[/OUTPUT]
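
For readers on a current Python, here is a minimal sketch of three similar lookups using only the standard library. Later versions of xml.etree.ElementTree (1.3 onward, as shipped with Python 2.7 and 3.x) accept simple attribute predicates such as [@type='replicate'] directly in find/findall paths. This is not ElementFilter and makes no claim about its query syntax. The sample is a trimmed copy of the document above, and the predicate values (type 'replicate', id '1234') are simply picked from that sample for illustration.

```python
from xml.etree import ElementTree as ET

# Trimmed copy of the sample document from this thread.
SAMPLE = """<phedexData xmlns="http://a.b.com/phedex">
  <request id="1234" last_update="1060199000.0">
    <status>
      <approved>T1_RAL_MSS</approved>
      <approved>T2_London_ICHEP</approved>
      <disapproved>T2_Southgrid_Bristol</disapproved>
    </status>
    <subscription open="1" priority="0" type="replicate">
      <items>
        <dataset>/PrimaryDS1/ProcessedDS1/Tier</dataset>
      </items>
    </subscription>
  </request>
</phedexData>"""

NS = "{http://a.b.com/phedex}"
root = ET.fromstring(SAMPLE)

# Plain path: every step carries the namespace in {uri}tag form.
status = root.findall("%srequest/%sstatus" % (NS, NS))

# Attribute predicate on an intermediate step.
items = root.findall(
    "%(ns)srequest/%(ns)ssubscription[@type='replicate']/%(ns)sitems" % {"ns": NS}
)

# Predicate on the first step, then collect the text of each match.
approved = [
    el.text
    for el in root.findall(
        "%(ns)srequest[@id='1234']/%(ns)sstatus/%(ns)sapproved" % {"ns": NS}
    )
]

print(len(status), len(items), approved)
```

The predicate support here is limited to simple tests such as attribute equality; full XPath expressions still need a library like lxml.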
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: searching an XML doc

2008-01-16 Thread Gowri
Hi Gerard,

I don't know what to say :) thank you so much for taking time to post
all of this. truly appreciate it :)


Re: searching an XML doc

2008-01-16 Thread Stefan Behnel
grflanagan wrote:
> On Jan 15, 9:33 pm, Gowri [EMAIL PROTECTED] wrote:
>> I've been reading about ElementTree and ElementPath so I could use
>> them to find the right elements in the DOM. Unfortunately neither of
>> these seems to offer XPath-like capabilities where I can find elements
>> based on tag, attribute values etc. Are there any libraries which can
>> give me XPath-like functionality?
>
> Create your query like:
>
> ns0 = '{http://a.b.com/phedex}'
>
> query = '%srequest/%sstatus' % (ns0, ns0)

lxml supports the same thing, BTW, and how to work with namespaces is
explained in the tutorial:

http://codespeak.net/lxml/dev/tutorial.html#namespaces
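
As a rough illustration of the point that both libraries accept the same {uri}tag notation, the sketch below imports lxml when it is installed and otherwise falls back to the standard-library ElementTree; the query code is identical either way. The inline sample document and the `qualified` helper are hypothetical, introduced only for this example.

```python
try:
    from lxml import etree  # third-party; used when installed
except ImportError:
    from xml.etree import ElementTree as etree  # standard-library fallback

PHEDEX_NS = "http://a.b.com/phedex"


def qualified(tag, ns=PHEDEX_NS):
    """Return the tag in {uri}tag form (hypothetical helper)."""
    return "{%s}%s" % (ns, tag)


doc = etree.fromstring(
    '<phedexData xmlns="http://a.b.com/phedex">'
    '<request id="1234"><status>'
    "<approved>T1_RAL_MSS</approved>"
    "<pending/>"
    "</status></request>"
    "</phedexData>"
)

# Direct child lookup with a namespace-qualified tag.
request = doc.find(qualified("request"))

# Walk all descendants with a given qualified tag, at any depth.
approved = [el.text for el in doc.iter(qualified("approved"))]

print(request.tag)        # parsed tags are stored in {uri}tag form as well
print(request.get("id"))  # unprefixed attributes are not in any namespace
print(approved)
```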

Stefan


searching an XML doc

2008-01-15 Thread Gowri
Hello,

I've been reading about ElementTree and ElementPath so I could use
them to find the right elements in the DOM. Unfortunately neither of
these seems to offer XPath-like capabilities where I can find elements
based on tag, attribute values etc. Are there any libraries which can
give me XPath-like functionality?

Thanks in advance


Re: searching an XML doc

2008-01-15 Thread Diez B. Roggisch
Gowri wrote:
> Hello,
>
> I've been reading about ElementTree and ElementPath so I could use
> them to find the right elements in the DOM. Unfortunately neither of
> these seems to offer XPath-like capabilities where I can find elements
> based on tag, attribute values etc. Are there any libraries which can
> give me XPath-like functionality?


lxml does that.

Diez


Re: searching an XML doc

2008-01-15 Thread Gowri
On Jan 15, 3:49 pm, Diez B. Roggisch [EMAIL PROTECTED] wrote:
> Gowri wrote:
>
>> Hello,
>>
>> I've been reading about ElementTree and ElementPath so I could use
>> them to find the right elements in the DOM. Unfortunately neither of
>> these seems to offer XPath-like capabilities where I can find elements
>> based on tag, attribute values etc. Are there any libraries which can
>> give me XPath-like functionality?
>
> lxml does that.
>
> Diez

Hi Diez

I was trying lxml out and was unable to find any examples that would
help me parse an XML file with namespaces. For example, my XML file
looks like this:

<phedexData xmlns="http://a.b.com/phedex"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://a.b.com/phedex requests.xsd">
  <!--  Low priority replication request -->
  <request id="1234" last_update="1060199000.0">
    <status>
      <approved>T1_RAL_MSS</approved>
      <approved>T2_London_ICHEP</approved>
      <disapproved>T2_Southgrid_Bristol</disapproved>
      <pending/>
      <move_pending/>
    </status>
    <subscription open="1" priority="0" type="replicate">
      <items>
        <dataset>/PrimaryDS1/ProcessedDS1/Tier</dataset>
        <block>/PrimaryDS2/ProcessedDS2/Tier/block</block>
      </items>
    </subscription>
  </request>
</phedexData>

If my XPath query is //request, it obviously will not work, because the
elements are in a namespace. Is there some sort of namespace registration
etc. that has to be done before issuing a query? Example code would help
a lot.
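
One common way to handle this, sketched here with the standard-library ElementTree so that it runs without extra packages, is to pass a prefix-to-URI mapping alongside the path. The prefix "p" is an arbitrary choice made for this example; it only has to map to the document's namespace URI, because an unprefixed "request" in a query refers to an element in no namespace. lxml's xpath() method takes a similar mapping through its namespaces keyword, e.g. root.xpath('//p:request', namespaces=NSMAP). The sample is a trimmed copy of the document above.

```python
from xml.etree import ElementTree as ET

# Trimmed copy of the sample document from this message.
SAMPLE = """<phedexData xmlns="http://a.b.com/phedex">
  <request id="1234" last_update="1060199000.0">
    <status>
      <approved>T1_RAL_MSS</approved>
      <approved>T2_London_ICHEP</approved>
    </status>
  </request>
</phedexData>"""

# Prefix used in the queries only; the document itself uses a default namespace.
NSMAP = {"p": "http://a.b.com/phedex"}

root = ET.fromstring(SAMPLE)

# An unqualified search finds nothing, since every element is namespaced.
unqualified = root.findall(".//request")

# The same search with the prefix resolved through NSMAP.
requests = root.findall(".//p:request", NSMAP)
sites = [el.text for el in root.findall(".//p:status/p:approved", NSMAP)]

print(len(unqualified), len(requests), sites)
```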

