Re: Error in Tree Structure

2016-02-29 Thread subhabangalore
On Saturday, February 27, 2016 at 9:43:56 PM UTC+5:30, Rustom Mody wrote:
> On Saturday, February 27, 2016 at 2:47:53 PM UTC+5:30, subhaba...@gmail.com 
> wrote:
> > I was trying to implement the code, 
> > 
> > import nltk
> > import nltk.tag, nltk.chunk, itertools
> > def ieertree2conlltags(tree, tag=nltk.tag.pos_tag):
> > words, ents = zip(*tree.pos())
> > iobs = []
> > prev = None
> > for ent in ents:
> > if ent == tree.node:
> > iobs.append('O')
> > prev = None
> > elif prev == ent:
> >  iobs.append('I-%s' % ent)
> > else:
> >  iobs.append('B-%s' % ent)
> >  prev = ent
> > words, tags = zip(*tag(words))
> > return itertools.izip(words, tags, iobs)
> > 
> > def ieer_chunked_sents(tag=nltk.tag.pos_tag):
> > for doc in ieer.parsed_docs():
> > tagged = ieertree2conlltags(doc.text, tag)
> > yield nltk.chunk.conlltags2tree(tagged)
> > 
> > 
> > from chunkers import ieer_chunked_sents, ClassifierChunker
> > from nltk.corpus import treebank_chunk
> > ieer_chunks = list(ieer_chunked_sents())
> > chunker = ClassifierChunker(ieer_chunks[:80])
> > print chunker.parse(treebank_chunk.tagged_sents()[0])
> > score = chunker.evaluate(ieer_chunks[80:])
> > print score.accuracy()
> > 
> > It is running fine. 
> > But as I am trying to rewrite the code as,
> > chunker = ClassifierChunker(list1),
> > where list1 is same value as,
> > ieer_chunks[:80]
> > only I am pasting the value as 
> > [Tree('S', [Tree('LOCATION', [(u'NAIROBI', 'NNP')]), (u',', ','), 
> > Tree('LOCATION', [(u'Kenya', 'NNP')]), (u'(', '('), Tree('ORGANIZATION', 
> > [(u'AP', 'NNP')]), (u')', ')'), (u'_', 'NNP'), Tree('CARDINAL', 
> > [(u'Thousands', 'NNP')]), (u'of', 'IN'), (u'laborers,', 'JJ'), 
> > (u'students', 'NNS'), (u'and', 'CC'), (u'opposition', 'NN'), 
> > (u'politicians', 'NNS'), (u'on', 'IN'), Tree('DATE', [(u'Saturday', 
> > 'NNP')]), (u'protested', 'VBD'), (u'tax', 'NN'), (u'hikes', 'NNS'), 
> > (u'imposed', 'VBN'), (u'by', 'IN'), (u'their', 'PRP$'), (u'cash-strapped', 
> > 'JJ'), (u'government,', 'NN'), (u'which', 'WDT'), (u'they', 'PRP'), 
> > (u'accused', 'VBD'), (u'of', 'IN'), (u'failing', 'VBG'), (u'to', 'TO'), 
> > (u'provide', 'VB'), (u'basic', 'JJ'), (u'services.', 
> > 'NN'),(u'(cm-kjd)', 'NN')])]
> > the value of whole list directly I am getting syntax error.
> 
> Dunno how literally you intend this but there is a "" near the end 
> of the list. Intended?

It is intended. As actual list was large.
And most likely I could solve the problem,
with 
from nltk.tree import Tree
I missed in my code.

Thank you for your kind time and discussion.

Regards,
RP
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Error in Tree Structure

2016-02-27 Thread Rustom Mody
On Saturday, February 27, 2016 at 2:47:53 PM UTC+5:30, subhaba...@gmail.com 
wrote:
> I was trying to implement the code, 
> 
> import nltk
> import nltk.tag, nltk.chunk, itertools
> def ieertree2conlltags(tree, tag=nltk.tag.pos_tag):
> words, ents = zip(*tree.pos())
> iobs = []
> prev = None
> for ent in ents:
> if ent == tree.node:
> iobs.append('O')
> prev = None
> elif prev == ent:
>  iobs.append('I-%s' % ent)
> else:
>  iobs.append('B-%s' % ent)
>  prev = ent
> words, tags = zip(*tag(words))
> return itertools.izip(words, tags, iobs)
> 
> def ieer_chunked_sents(tag=nltk.tag.pos_tag):
> for doc in ieer.parsed_docs():
> tagged = ieertree2conlltags(doc.text, tag)
> yield nltk.chunk.conlltags2tree(tagged)
> 
> 
> from chunkers import ieer_chunked_sents, ClassifierChunker
> from nltk.corpus import treebank_chunk
> ieer_chunks = list(ieer_chunked_sents())
> chunker = ClassifierChunker(ieer_chunks[:80])
> print chunker.parse(treebank_chunk.tagged_sents()[0])
> score = chunker.evaluate(ieer_chunks[80:])
> print score.accuracy()
> 
> It is running fine. 
> But as I am trying to rewrite the code as,
> chunker = ClassifierChunker(list1),
> where list1 is same value as,
> ieer_chunks[:80]
> only I am pasting the value as 
> [Tree('S', [Tree('LOCATION', [(u'NAIROBI', 'NNP')]), (u',', ','), 
> Tree('LOCATION', [(u'Kenya', 'NNP')]), (u'(', '('), Tree('ORGANIZATION', 
> [(u'AP', 'NNP')]), (u')', ')'), (u'_', 'NNP'), Tree('CARDINAL', 
> [(u'Thousands', 'NNP')]), (u'of', 'IN'), (u'laborers,', 'JJ'), (u'students', 
> 'NNS'), (u'and', 'CC'), (u'opposition', 'NN'), (u'politicians', 'NNS'), 
> (u'on', 'IN'), Tree('DATE', [(u'Saturday', 'NNP')]), (u'protested', 'VBD'), 
> (u'tax', 'NN'), (u'hikes', 'NNS'), (u'imposed', 'VBN'), (u'by', 'IN'), 
> (u'their', 'PRP$'), (u'cash-strapped', 'JJ'), (u'government,', 'NN'), 
> (u'which', 'WDT'), (u'they', 'PRP'), (u'accused', 'VBD'), (u'of', 'IN'), 
> (u'failing', 'VBG'), (u'to', 'TO'), (u'provide', 'VB'), (u'basic', 'JJ'), 
> (u'services.', 'NN'),(u'(cm-kjd)', 'NN')])]
> the value of whole list directly I am getting syntax error.

Dunno how literally you intend this but there is a "" near the end 
of the list. Intended?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Error in Tree Structure

2016-02-27 Thread Steven D'Aprano
On Sat, 27 Feb 2016 08:17 pm, subhabangal...@gmail.com wrote:

> Is it any error in Python part or in NLTK part?

Neither.

Any time you think there is an error in Python, it is 99.9% sure that the
error is in your code, not Python.

If the error is a SyntaxError, that is 99.9%.

> If any one may guide me what is the error I am doing and how may I solve
> it.

Look at the SyntaxError traceback and read what it says. Does it tell you
what the error is? Does it use a ^ as an arrow to point to the error, or
immediately after the error?

Chances are, the error is that you have added or deleted a bracket or
parenthesis somewhere. Dealing with nested lists like this is usually
awful, because they are unreadable. Can you avoid copying and pasting the
nested list?

If not, try re-formatting it so you can at least read it:

# Unreadable, bad:
thelist = [Tree('S', [Tree('LOCATION', [(u'NAIROBI', 'NNP')]), (u',', ','),
Tree('LOCATION', [(u'Kenya', 'NNP')]), (u'(', '('), Tree('ORGANIZATION',
[(u'AP', 'NNP')]), (u')', ')'), (u'_', 'NNP'), Tree('CARDINAL',
[(u'Thousands', 'NNP')]), (u'of', 'IN'), (u'laborers,', 'JJ'),
(u'students', 'NNS'), (u'and', 'CC'), (u'opposition', 'NN'),
(u'politicians', 'NNS'), (u'on', 'IN'), Tree('DATE',
[(u'Saturday', 'NNP')]), (u'protested', 'VBD'), (u'tax', 'NN'),
(u'hikes', 'NNS'), (u'imposed', 'VBN'), (u'by', 'IN'), (u'their', 'PRP$'),
(u'cash-strapped', 'JJ'), (u'government,', 'NN'), (u'which', 'WDT'),
(u'they', 'PRP'), (u'accused', 'VBD'), (u'of', 'IN'), (u'failing', 'VBG'),
(u'to', 'TO'), (u'provide', 'VB'), (u'basic', 'JJ'), (u'services.', 'NN'),
(u'(cm-kjd)', 'NN')])]


# Slightly more readable, good:
thelist = [
  Tree('S', [
Tree(
'LOCATION', [
(u'NAIROBI', 'NNP')
]
), 
(u',', ','), 
Tree(
'LOCATION', [
(u'Kenya', 'NNP')
]
), 
(u'(', '('), 
Tree(
'ORGANIZATION', [
(u'AP', 'NNP')
]
   ),
(u')', ')'), 
(u'_', 'NNP'), 
Tree(
'CARDINAL', [
(u'Thousands', 'NNP')
]
), 
(u'of', 'IN'), 
(u'laborers,', 'JJ'), 
(u'students', 'NNS'), 
(u'and', 'CC'), 
(u'opposition', 'NN'), 
(u'politicians', 'NNS'), 
(u'on', 'IN'), 
Tree(
'DATE', [
(u'Saturday', 'NNP')
]
), 
(u'protested', 'VBD'), 
(u'tax', 'NN'), 
(u'hikes', 'NNS'), 
(u'imposed', 'VBN'), 
(u'by', 'IN'), 
(u'their', 'PRP$'), 
(u'cash-strapped', 'JJ'), 
(u'government,', 'NN'), 
(u'which', 'WDT'), 
(u'they', 'PRP'), 
(u'accused', 'VBD'), 
(u'of', 'IN'), 
(u'failing', 'VBG'), 
(u'to', 'TO'), 
(u'provide', 'VB'), 
(u'basic', 'JJ'), 
(u'services.', 'NN'), 
(u'(cm-kjd)', 'NN')
]
  )   
  ]



If you format your giant list so you can see the structure, then you will
likely find where there is a problem. Perhaps you have:

- missing or too many [ ( ) or ]

- a missing or extra comma

- a missing or extra quotation marks

- unexpected symbols like  inside the list




-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Error in Tree Structure

2016-02-27 Thread subhabangalore
I was trying to implement the code, 

import nltk
import nltk.tag, nltk.chunk, itertools
def ieertree2conlltags(tree, tag=nltk.tag.pos_tag):
words, ents = zip(*tree.pos())
iobs = []
prev = None
for ent in ents:
if ent == tree.node:
iobs.append('O')
prev = None
elif prev == ent:
 iobs.append('I-%s' % ent)
else:
 iobs.append('B-%s' % ent)
 prev = ent
words, tags = zip(*tag(words))
return itertools.izip(words, tags, iobs)

def ieer_chunked_sents(tag=nltk.tag.pos_tag):
for doc in ieer.parsed_docs():
tagged = ieertree2conlltags(doc.text, tag)
yield nltk.chunk.conlltags2tree(tagged)


from chunkers import ieer_chunked_sents, ClassifierChunker
from nltk.corpus import treebank_chunk
ieer_chunks = list(ieer_chunked_sents())
chunker = ClassifierChunker(ieer_chunks[:80])
print chunker.parse(treebank_chunk.tagged_sents()[0])
score = chunker.evaluate(ieer_chunks[80:])
print score.accuracy()

It is running fine. 
But as I am trying to rewrite the code as,
chunker = ClassifierChunker(list1),
where list1 is same value as,
ieer_chunks[:80]
only I am pasting the value as 
[Tree('S', [Tree('LOCATION', [(u'NAIROBI', 'NNP')]), (u',', ','), 
Tree('LOCATION', [(u'Kenya', 'NNP')]), (u'(', '('), Tree('ORGANIZATION', 
[(u'AP', 'NNP')]), (u')', ')'), (u'_', 'NNP'), Tree('CARDINAL', [(u'Thousands', 
'NNP')]), (u'of', 'IN'), (u'laborers,', 'JJ'), (u'students', 'NNS'), (u'and', 
'CC'), (u'opposition', 'NN'), (u'politicians', 'NNS'), (u'on', 'IN'), 
Tree('DATE', [(u'Saturday', 'NNP')]), (u'protested', 'VBD'), (u'tax', 'NN'), 
(u'hikes', 'NNS'), (u'imposed', 'VBN'), (u'by', 'IN'), (u'their', 'PRP$'), 
(u'cash-strapped', 'JJ'), (u'government,', 'NN'), (u'which', 'WDT'), (u'they', 
'PRP'), (u'accused', 'VBD'), (u'of', 'IN'), (u'failing', 'VBG'), (u'to', 'TO'), 
(u'provide', 'VB'), (u'basic', 'JJ'), (u'services.', 'NN'),(u'(cm-kjd)', 
'NN')])]
the value of whole list directly I am getting syntax error.
I tried to paste it in Python IDE outside code there also it is giving syntax 
error. 
If I do not paste the value and and rename ieer_chunks[:80] as list1 there is 
no error.
I may be doing some problem while copying the value and pasting it. 
But I did not change anything there.

Is it any error in Python part or in NLTK part? 

Thanks in advance.
If any one may guide me what is the error I am doing and how may I solve it.


 

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Tree structure

2011-07-26 Thread Bevan Jenkins
On Jul 26, 8:46 pm, Peter Otten <__pete...@web.de> wrote:
> Bevan Jenkins wrote:
> > Hello,
>
> > I am trying to create a tree structure for use with a PyQt QTreeView.
> > But first I need to get my head around how to create the tree
> > structure.  I have a dictionary (for testing purposes) but I will
> > later use a table via sqlalchemy.
>> SNIP<<
> > Any thoughts about how to acomplish this will be much appreciated,
> > Bevan
>
> If you turn the values into lists you can use the same function for both
> trees:
>
> INDENT = " " * 4
>
> def print_tree(lookup, node=None):
>     def _tree(node, level):
>         print "%s%s" % (INDENT * level, node)
>         for node in lookup.get(node, ()):
>             _tree(node, level+1)
>
>     if node is None:
>         for node in lookup:
>             _tree(node, 0)
>     else:
>         _tree(node, 0)
>
> def reversed_dict(tree):
>     reversed_tree = {}
>     for key, values in rivers.iteritems():
>         for value in values:
>             reversed_tree.setdefault(value, []).append(key)
>     return reversed_tree
>
> if __name__ == "__main__":
>     rivers = {
>         "little stream": "sea",
>         "mountain stream": "lake",
>         "lake": "big river",
>         "cold spring": "big river",
>         "big river": "sea",
>         "see": ""}
>
>     rivers = dict((k, [v]) for k, v in rivers.iteritems() if v)
>     print_tree(rivers)
>     print "---"
>     print_tree(reversed_dict(rivers), "sea")- Hide quoted text -
>
> - Show quoted text -

Peter,
Thank you that does what I need!  Now I just need to incorporate into
PyQt but that shouldn't be too hard...

Macro,
I need to look into lxml in the coming months, so I might revisit this
then.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure

2011-07-26 Thread Peter Otten
Bevan Jenkins wrote:

> Hello,
> 
> I am trying to create a tree structure for use with a PyQt QTreeView.
> But first I need to get my head around how to create the tree
> structure.  I have a dictionary (for testing purposes) but I will
> later use a table via sqlalchemy.
> 
> The use case is hydrology, so I  would like to have a hydrologically
> connected river tree, in which you can browse upstream from the sea
> (making choices) or downstream from any named hydrological feature.
> Each key flows into its value pair. myrivers =
> {"river":"flows_into"}.  An example is below:
> 
> myrivers = {"little stream":"sea",
> "mountain stream":"lake",
> "lake":"big river",
> "cold spring":"big river",
> "big river":"sea"
> "sea":""}
> 
> I would like the tree to look like (if the formatting works). so
> you can browse downstream from each named river but also upstream from
> the sea picking which direction to go.
> 
> little stream
> sea
> mountain stream
> lake
> big river
> sea
> lake
> big river
> sea
> cold spring
> big river
> sea
> big river
> sea
> sea
> little stream
> big river
> lake
> mountain stream
> cold spring
> 
> <>
> So every key is a parent.  For all keys that have a value (not ""),
> the value is the child and is then used as a parent to get the next
> child until the sea and a value of "" is reached.  For the sea this is
> reversed, that you find all rivers that flow into the sea and then all
> rivers that flow into them.
> 
> 
> Any thoughts about how to acomplish this will be much appreciated,
> Bevan

If you turn the values into lists you can use the same function for both 
trees:

INDENT = " " * 4

def print_tree(lookup, node=None):
def _tree(node, level):
print "%s%s" % (INDENT * level, node)
for node in lookup.get(node, ()):
_tree(node, level+1)

if node is None:
for node in lookup:
_tree(node, 0)
else:
_tree(node, 0)

def reversed_dict(tree):
reversed_tree = {}
for key, values in rivers.iteritems():
for value in values:
reversed_tree.setdefault(value, []).append(key)
return reversed_tree

if __name__ == "__main__":
rivers = {
"little stream": "sea",
"mountain stream": "lake",
"lake": "big river",
"cold spring": "big river",
"big river": "sea",
"see": ""}

rivers = dict((k, [v]) for k, v in rivers.iteritems() if v)
print_tree(rivers)
print "---"
print_tree(reversed_dict(rivers), "sea")


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure

2011-07-26 Thread Marco Nawijn
On Jul 26, 6:53 am, Bevan Jenkins  wrote:
> Hello,
>
> I am trying to create a tree structure for use with a PyQt QTreeView.
> But first I need to get my head around how to create the tree
> structure.  I have a dictionary (for testing purposes) but I will
> later use a table via sqlalchemy.
>
> The use case is hydrology, so I  would like to have a hydrologically
> connected river tree, in which you can browse upstream from the sea
> (making choices) or downstream from any named hydrological feature.
> Each key flows into its value pair. myrivers =
> {"river":"flows_into"}.  An example is below:
>
> myrivers = {"little stream":"sea",
>     "mountain stream":"lake",
>     "lake":"big river",
>     "cold spring":"big river",
>     "big river":"sea"
>     "sea":""}
>
> I would like the tree to look like (if the formatting works). so
> you can browse downstream from each named river but also upstream from
> the sea picking which direction to go.
>
> little stream
>     sea
> mountain stream
>     lake
>         big river
>             sea
> lake
>     big river
>         sea
> cold spring
>     big river
>         sea
> big river
>     sea
> sea
>     little stream
>     big river
>         lake
>             mountain stream
>         cold spring
>
> <>
> So every key is a parent.  For all keys that have a value (not ""),
> the value is the child and is then used as a parent to get the next
> child until the sea and a value of "" is reached.  For the sea this is
> reversed, that you find all rivers that flow into the sea and then all
> rivers that flow into them.
>
> Any thoughts about how to acomplish this will be much appreciated,
> Bevan

Hello Bevan,

Is it an option to use XML as an in-memory representation. It
naturally provides the
interface you need, like traversing from parent to children and back.
In addition, you
get querying capabilities like XPATH for free. In python I recommend
lxml.

Your task than is to fill the tree based on the information in the
database. The ORM functionality
from sqlalchemy will be of help here. In addition, you somehow have to
populate the QT tree with the
data from your in-memory XML representation. I have no experience with
QT so I cannot help you there.

Regards,

Marco
-- 
http://mail.python.org/mailman/listinfo/python-list


Tree structure

2011-07-25 Thread Bevan Jenkins
Hello,

I am trying to create a tree structure for use with a PyQt QTreeView.
But first I need to get my head around how to create the tree
structure.  I have a dictionary (for testing purposes) but I will
later use a table via sqlalchemy.

The use case is hydrology, so I  would like to have a hydrologically
connected river tree, in which you can browse upstream from the sea
(making choices) or downstream from any named hydrological feature.
Each key flows into its value pair. myrivers =
{"river":"flows_into"}.  An example is below:

myrivers = {"little stream":"sea",
"mountain stream":"lake",
"lake":"big river",
"cold spring":"big river",
"big river":"sea"
"sea":""}

I would like the tree to look like (if the formatting works). so
you can browse downstream from each named river but also upstream from
the sea picking which direction to go.

little stream
sea
mountain stream
lake
big river
sea
lake
big river
sea
cold spring
big river
sea
big river
sea
sea
little stream
big river
lake
mountain stream
cold spring

<>
So every key is a parent.  For all keys that have a value (not ""),
the value is the child and is then used as a parent to get the next
child until the sea and a value of "" is reached.  For the sea this is
reversed, that you find all rivers that flow into the sea and then all
rivers that flow into them.


Any thoughts about how to acomplish this will be much appreciated,
Bevan



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Is there a standard name for this tree structure?

2010-04-06 Thread Lie Ryan
On 04/07/10 14:18, Steven D'Aprano wrote:
> I could have used None, or "root", or "this is a magic value that 
> probably won't clash with an entry in the tree", or -1 as a sentinel 
> instead, but they all risk accidental clashes with tree entries.

Especially when you want to consider the possibility of inserting the
data structure inside the data structure itself.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Is there a standard name for this tree structure?

2010-04-06 Thread Steven D'Aprano
On Tue, 06 Apr 2010 19:16:05 +0200, egbert wrote:

> On Sun, Apr 04, 2010 at 12:10:02PM +, Steven D'Aprano wrote:
>  
>> I can implement this tree using a flat dict:
>> 
>> root = object()
>> data = {root: ['Mammal', 'Reptile'],
> 
> What is the advantage, or thougth behind, using an instance of object as
> the root of your flat tree ? egbert

It's a known sentinel that can't possibly clash with an entry in the tree 
except by deliberate action on the part of the caller.

I could have used None, or "root", or "this is a magic value that 
probably won't clash with an entry in the tree", or -1 as a sentinel 
instead, but they all risk accidental clashes with tree entries.



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Is there a standard name for this tree structure?

2010-04-06 Thread egbert
On Sun, Apr 04, 2010 at 12:10:02PM +, Steven D'Aprano wrote:
 
> I can implement this tree using a flat dict:
> 
> root = object()
> data = {root: ['Mammal', 'Reptile'], 

What is the advantage, or thougth behind, using an instance
of object as the root of your flat tree ?
egbert

-- 
Egbert Bouwman - Keizersgracht 197 II - 1016 DS  Amsterdam - 020 6257991

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Is there a standard name for this tree structure?

2010-04-04 Thread Patrick Maupin
On Apr 4, 10:41 am, Patrick Maupin  wrote:

> The primary differences between this structure and just haphazardly
> wiring up random objects into a directed graph are that (1) there may
> be some performance differences (but when the garbage collector has to
> figure out how to break cycles, these difference might or might not be
> in the direction you expect) and (2) with this structure, every object
> needs a globally unique name for indexing purposes.

(Just wanted to point out that I realize there aren't any cycles in
Steven's example data -- point (1) this paragraph was about the more
general use of the dict structure for general directed graphs.)

Pat
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Is there a standard name for this tree structure?

2010-04-04 Thread Patrick Maupin
On Apr 4, 9:06 am, Duncan Booth  wrote:
> Do you have any carniverous apes? If so it's a directed acyclic graph.

Well, since he has a root node, he's really only described the *use*
of this data structure implementation for a rooted tree.

As you point out, the implementation itself is more general than a
rooted tree, and could be used for any DAG.  But you don't go far
enough.  This particular implementation could be used for any directed
graph -- there is nothing (other than the problem domain) requiring
that data in this dict be acyclic.  In fact, there are sometimes good
reasons for using this sort of structure for cyclic data.  Having a
name for each node, and always indirectly referring to the node by its
name, can sometimes greatly simplify the implementation of problems
requiring a cyclic graph (starting with the really basic problem of
wanting to dump the whole structure out in a reasonable fashion for
debugging).

The primary differences between this structure and just haphazardly
wiring up random objects into a directed graph are that (1) there may
be some performance differences (but when the garbage collector has to
figure out how to break cycles, these difference might or might not be
in the direction you expect) and (2) with this structure, every object
needs a globally unique name for indexing purposes.

Having said all that, I don't know what the proper name for this data
structure is, either :-)

However, in determining the name, I would probably focus on the data
structure being represented (in Steven's case, a tree), and the data
structure used to represent it (a Python dict), and *why/how* the
underlying structure is used.  So I would probably call it something
like an "associatively addressed tree".

Regards,
Pat
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Is there a standard name for this tree structure?

2010-04-04 Thread Stefan Behnel

Steven D'Aprano, 04.04.2010 14:10:

I have a hierarchical structure something like a directory tree or a
nested tree structure:

Mammal
+-- Ape
:   +-- Chimpanzee
:   +-- Gorilla
:   +-- Human
+-- Carnivore
:   +-- Cat
:   +-- Tiger
Reptile
+-- Lizard
+-- Snake
 +-- Cobra
 +-- Python


Maybe not quite what you asked for, but given the names in the example 
above I would call this an ontology.


Stefan

--
http://mail.python.org/mailman/listinfo/python-list


Re: Is there a standard name for this tree structure?

2010-04-04 Thread Duncan Booth
Steven D'Aprano  wrote:

> I have a hierarchical structure something like a directory tree or a 
> nested tree structure:
> 
> Mammal
> +-- Ape
>:   +-- Chimpanzee
>:   +-- Gorilla
>:   +-- Human
> +-- Carnivore
>:   +-- Cat
>:   +-- Tiger
> Reptile
> +-- Lizard
> +-- Snake
> +-- Cobra
> +-- Python
> 
...snip...
> Is there a particular name for this structure? I've been calling it a 
> dict-based flattened tree, but I can't shake the feeling that there is 
> another name for it...
> 
Do you have any carniverous apes? If so it's a directed acyclic graph.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Is there a standard name for this tree structure?

2010-04-04 Thread Alf P. Steinbach

* Steven D'Aprano:
I have a hierarchical structure something like a directory tree or a 
nested tree structure:


Mammal
+-- Ape
:   +-- Chimpanzee
:   +-- Gorilla
:   +-- Human
+-- Carnivore
:   +-- Cat
:   +-- Tiger
Reptile
+-- Lizard
+-- Snake
+-- Cobra
+-- Python

This is a forest because each top-level element (Mammal, Reptile) is 
itself a tree. By adding a root to the (former) top-level elements, I 
turn it into a single tree.


I can implement this tree using a flat dict:

root = object()
data = {root: ['Mammal', 'Reptile'], 
'Mammal': ['Ape', 'Carnivore'],

'Ape': ['Chimpanzee', 'Gorilla', 'Human'],
'Reptile': ['Lizard', 'Snake'],
# fill in the rest yourself... 
}



Is there a particular name for this structure? I've been calling it a 
dict-based flattened tree, but I can't shake the feeling that there is 
another name for it...


The only difference from an ordinary tree is that you have a provided a way to 
access non-root nodes (with strings as keys) that doesn't work for the root node.


Still, all nodes can be accessed with objects as keys.

So, you have just introduced some ambiguity that allows both views: it's a tree, 
and it's a forest, depending on what point of view one chooses.


In passing, this terminology is the one used in programming and computer 
science.

Donald Knuth notes that for e.g. a binary tree, if one (impractically) adopted 
the terminology of some authors on graph theory then one would have to say 
"topological bifurcating arborescence" instead of "binary tree"...



Cheers & hth.,

- Alf
--
http://mail.python.org/mailman/listinfo/python-list


Is there a standard name for this tree structure?

2010-04-04 Thread Steven D'Aprano
I have a hierarchical structure something like a directory tree or a 
nested tree structure:

Mammal
+-- Ape
:   +-- Chimpanzee
:   +-- Gorilla
:   +-- Human
+-- Carnivore
:   +-- Cat
:   +-- Tiger
Reptile
+-- Lizard
+-- Snake
+-- Cobra
+-- Python

This is a forest because each top-level element (Mammal, Reptile) is 
itself a tree. By adding a root to the (former) top-level elements, I 
turn it into a single tree.

I can implement this tree using a flat dict:

root = object()
data = {root: ['Mammal', 'Reptile'], 
'Mammal': ['Ape', 'Carnivore'],
'Ape': ['Chimpanzee', 'Gorilla', 'Human'],
'Reptile': ['Lizard', 'Snake'],
# fill in the rest yourself... 
}


Is there a particular name for this structure? I've been calling it a 
dict-based flattened tree, but I can't shake the feeling that there is 
another name for it...



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tree Structure

2009-10-10 Thread Shakeeb Alireza
> Is there any python class to display the drive and folder structure as
> a tree(As you see in the windows explorer window)??

http://wiki.wxpython.org/TreeControls

S
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tree Structure

2009-10-09 Thread r
On Oct 9, 5:02 am, Girish  wrote:
> Is there any python class to display the drive and folder structure as
> a tree(As you see in the windows explorer window)??


You could use a recursive function to print it out of course or you
will need to use a GUI kit. wxPython has a tree widget, i think TIX
has a simplistic tree widget? TIX is included in the stdlib
-- 
http://mail.python.org/mailman/listinfo/python-list


Tree Structure

2009-10-09 Thread Girish
Hello,

Is there any python class to display the drive and folder structure as
a tree(As you see in the windows explorer window)??

Thanks,
Girish..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure consuming lot of memory

2009-07-06 Thread Chris Rebert
> On Tue, Jul 7, 2009 at 1:28 AM, Antoine Pitrou  wrote:
>>
>> mayank gupta  gmail.com> writes:
>> >
>> > After a little analysis, I found out that in general it uses about
>> > 1.4 kb of memory for each node!!
>>
>> How did you measure memory use? Python objects are not very compact, but
>> 1.4KB
>> per object seems a bit too much (I would expect more about 150-200
>> bytes/object
>> in 32-bit mode, or 300-400 bytes/object in 64-bit mode).
>>
>> One of the solutions is to use __slots__ as already suggested. Another,
>> which
>> will have similar benefits, is to use a namedtuple. Both suppress the
>> instance
>> dictionnary (`instance`.__dict__), which is a major contributor to memory
>> consumption. Illustration (64-bit mode, by the way):
>>
>> >>> import sys
>> >>> from collections import namedtuple
>>
>> # First a normal class
>> >>> class Node(object): pass
>> ...
>> >>> o = Node()
>> >>> o.value = 1
>> >>> o.children = ()
>> >>>
>> >>> sys.getsizeof(o)
>> 64
>> >>> sys.getsizeof(o.__dict__)
>> 280
>> # The object seems to take a mere 64 bytes, but the attribute dictionnary
>> # adds a whoppy 280 bytes and bumps actual size to 344 bytes!
>>
>> # Now a namedtuple (a tuple subclass with property accessors for the
>> various
>> # tuple items)
>> >>> Node = namedtuple("Node", "value children")
>> >>>
>> >>> o = Node(value=1, children=())
>> >>> sys.getsizeof(o)
>> 72
>> >>> sys.getsizeof(o.__dict__)
>> Traceback (most recent call last):
>>  File "", line 1, in 
>> AttributeError: 'Node' object has no attribute '__dict__'
>>
>> # The object doesn't have a __dict__, so 72 bytes is its real total size.
On Mon, Jul 6, 2009 at 1:30 PM, mayank gupta wrote:
> I worked out a small code which initializes about 1,000,000 nodes with some
> attributes, and saw the memory usage on my linux machine (using 'top'
> command). Then just later I averaged out the memory usage per node. I know
> this is not the most accurate way but just for estimated value.

You should try the more accurate sys.getsizeof() function:
http://docs.python.org/library/sys.html#sys.getsizeof

Cheers,
Chris

P.S. Please don't top-post in the future.
-- 
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure consuming lot of memory

2009-07-06 Thread mayank gupta
I worked out a small code which initializes about 1,000,000 nodes with some
attributes, and saw the memory usage on my linux machine (using 'top'
command). Then just later I averaged out the memory usage per node. I know
this is not the most accurate way but just for estimated value.

The kind of Node class I am working on in my original code is like :

class Node:
 def __init__(self, #attributes ):
 self.coordinates = coordinates
 self.index = index
 self.sibNum = sibNum
 self.branchNum - branchNum

#here 'coordinates' and 'index' are LISTS with length = "dimension", where
"dimension" is a user-input.

The most shocking part of it after the memory-analysis was that, the memory
usage was never dependent on the "dimension". Yeah it varied a bit, but
there wasnt any significant changes in the memory usage even when the
"dimension" was doubled

-- Any clues?

Thank you for all your suggestions till this point.

Regards.




On Tue, Jul 7, 2009 at 1:28 AM, Antoine Pitrou  wrote:

> mayank gupta  gmail.com> writes:
> >
> > After a little analysis, I found out that in general it uses about
> > 1.4 kb of memory for each node!!
>
> How did you measure memory use? Python objects are not very compact, but
> 1.4KB
> per object seems a bit too much (I would expect more about 150-200
> bytes/object
> in 32-bit mode, or 300-400 bytes/object in 64-bit mode).
>
> One of the solutions is to use __slots__ as already suggested. Another,
> which
> will have similar benefits, is to use a namedtuple. Both suppress the
> instance
> dictionnary (`instance`.__dict__), which is a major contributor to memory
> consumption. Illustration (64-bit mode, by the way):
>
> >>> import sys
> >>> from collections import namedtuple
>
> # First a normal class
> >>> class Node(object): pass
> ...
> >>> o = Node()
> >>> o.value = 1
> >>> o.children = ()
> >>>
> >>> sys.getsizeof(o)
> 64
> >>> sys.getsizeof(o.__dict__)
> 280
> # The object seems to take a mere 64 bytes, but the attribute dictionnary
> # adds a whoppy 280 bytes and bumps actual size to 344 bytes!
>
> # Now a namedtuple (a tuple subclass with property accessors for the
> various
> # tuple items)
> >>> Node = namedtuple("Node", "value children")
> >>>
> >>> o = Node(value=1, children=())
> >>> sys.getsizeof(o)
> 72
> >>> sys.getsizeof(o.__dict__)
> Traceback (most recent call last):
>  File "", line 1, in 
> AttributeError: 'Node' object has no attribute '__dict__'
>
> # The object doesn't have a __dict__, so 72 bytes is its real total size.
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>



-- 
I luv to walk in rain bcoz no one can see me crying
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure consuming lot of memory

2009-07-06 Thread Antoine Pitrou
mayank gupta  gmail.com> writes:
> 
> After a little analysis, I found out that in general it uses about
> 1.4 kb of memory for each node!!

How did you measure memory use? Python objects are not very compact, but 1.4KB
per object seems a bit too much (I would expect more about 150-200 bytes/object
in 32-bit mode, or 300-400 bytes/object in 64-bit mode).

One of the solutions is to use __slots__ as already suggested. Another, which
will have similar benefits, is to use a namedtuple. Both suppress the instance
dictionnary (`instance`.__dict__), which is a major contributor to memory
consumption. Illustration (64-bit mode, by the way):

>>> import sys
>>> from collections import namedtuple

# First a normal class
>>> class Node(object): pass
... 
>>> o = Node()
>>> o.value = 1
>>> o.children = ()
>>> 
>>> sys.getsizeof(o)
64
>>> sys.getsizeof(o.__dict__)
280
# The object seems to take a mere 64 bytes, but the attribute dictionnary
# adds a whoppy 280 bytes and bumps actual size to 344 bytes!

# Now a namedtuple (a tuple subclass with property accessors for the various
# tuple items)
>>> Node = namedtuple("Node", "value children")
>>> 
>>> o = Node(value=1, children=())
>>> sys.getsizeof(o)
72
>>> sys.getsizeof(o.__dict__)
Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'Node' object has no attribute '__dict__'

# The object doesn't have a __dict__, so 72 bytes is its real total size.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure consuming lot of memory

2009-07-06 Thread Simon Forman
On Mon, Jul 6, 2009 at 6:12 AM, mayank gupta wrote:
> Thanks for the other possibilites. I would consider option (2) and (3) to
> improve my code.
>
> But out of curiosity, I would still like to know why does an object of a
> Python-class consume "so" much of memory (1.4 kb), and this memory usage has
> nothing to do with its attributes.
>
> Thanks
>
> Regards.
>
> On Mon, Jul 6, 2009 at 12:03 PM, Chris Rebert  wrote:
>>
>> On Mon, Jul 6, 2009 at 2:55 AM, mayank gupta wrote:
>> > Hi,
>> >
>> > I am creating a tree data-structure in python; with nodes of the tree
>> > created by a simple class :
>> >
>> > class Node :
>> >    def __init__(self ,  other attributes):
>> >   # initialise the attributes here!!
>> >
>> > But the problem is I am working with a huge tree (millions of nodes);
>> > and
>> > each node is consuming much more memory than it should. After a little
>> > analysis, I found out that in general it uses about 1.4 kb of memory for
>> > each node!!
>> > I will be grateful if someone could help me optimize the memory usage.
>>
>> (1) Use __slots__ (see
>> http://docs.python.org/reference/datamodel.html#slots)
>> (2) Use some data structure other than a tree
>> (3) Rewrite your Node/Tree implementation in C
>>
>> Cheers,
>> Chris
>> --
>> http://blog.rebertia.com
>

For option 2 you should try using the built in types, list tuple or
dict.  You might get better results.

I'm curious too as to why the class/instance code should take so much
memory, could you mention more about the code?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure consuming lot of memory

2009-07-06 Thread mayank gupta
Thanks for the other possibilites. I would consider option (2) and (3) to
improve my code.

But out of curiosity, I would still like to know why does an object of a
Python-class consume "so" much of memory (1.4 kb), and this memory usage has
nothing to do with its attributes.

Thanks

Regards.

On Mon, Jul 6, 2009 at 12:03 PM, Chris Rebert  wrote:

> On Mon, Jul 6, 2009 at 2:55 AM, mayank gupta wrote:
> > Hi,
> >
> > I am creating a tree data-structure in python; with nodes of the tree
> > created by a simple class :
> >
> > class Node :
> >def __init__(self ,  other attributes):
> >   # initialise the attributes here!!
> >
> > But the problem is I am working with a huge tree (millions of nodes); and
> > each node is consuming much more memory than it should. After a little
> > analysis, I found out that in general it uses about 1.4 kb of memory for
> > each node!!
> > I will be grateful if someone could help me optimize the memory usage.
>
> (1) Use __slots__ (see
> http://docs.python.org/reference/datamodel.html#slots)
> (2) Use some data structure other than a tree
> (3) Rewrite your Node/Tree implementation in C
>
> Cheers,
> Chris
> --
> http://blog.rebertia.com
>



-- 
I luv to walk in rain bcoz no one can see me crying
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure consuming lot of memory

2009-07-06 Thread Chris Rebert
On Mon, Jul 6, 2009 at 2:55 AM, mayank gupta wrote:
> Hi,
>
> I am creating a tree data-structure in python; with nodes of the tree
> created by a simple class :
>
> class Node :
>    def __init__(self ,  other attributes):
>   # initialise the attributes here!!
>
> But the problem is I am working with a huge tree (millions of nodes); and
> each node is consuming much more memory than it should. After a little
> analysis, I found out that in general it uses about 1.4 kb of memory for
> each node!!
> I will be grateful if someone could help me optimize the memory usage.

(1) Use __slots__ (see http://docs.python.org/reference/datamodel.html#slots)
(2) Use some data structure other than a tree
(3) Rewrite your Node/Tree implementation in C

Cheers,
Chris
-- 
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Tree structure consuming lot of memory

2009-07-06 Thread mayank gupta
Hi,

I am creating a tree data-structure in python; with nodes of the tree
created by a simple class :

class Node :
   def __init__(self ,  other attributes):
  # initialise the attributes here!!

But the problem is I am working with a huge tree (millions of nodes); and
each node is consuming much more memory than it should. After a little
analysis, I found out that in general it uses about 1.4 kb of memory for
each node!!
I will be grateful if someone could help me optimize the memory usage.
Thanks.

Regards,
Mayank


-- 
I luv to walk in rain bcoz no one can see me crying
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to request data from a lazily-created tree structure ?

2008-08-02 Thread méchoui
On 17 juin, 13:53, méchoui <[EMAIL PROTECTED]> wrote:
> On Jun 17, 9:08 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
>
>
>
> > > Yes, I need to make sure my requests are properly written so that the
> > > generic XPath engine does not need all the structure in memory.
>
> > > There are quite a few cases where you really don't need to load
> > > everything at all. /a/b/*/c/d is an example. But even with an example
> > > like /x/z[last()]/t, you don't need to load everything under the
> > > every /x/z nodes. You just need to check for the latest one, and make
> > > sure there is a t node under it.
>
> > > Anyway, if I need to make requests that need all the data... that
> > > means that the need for lazy instantiation of nodes disappears,
> > > right ?
>
> > Yes. And unless you have memory-constraints I have to admit that I
> > really doubt that the parsing overhead isn't by far exceeded by the
> > network latency.
>
> > Diez
>
> Do you know if there is such XPath engine that can be applied to a DOM-
> like structure ?
>
> One way would be to take an XPath engine from an existing XML engine
> (ElementTree, or any other), and see what APIs it calls... and see if
> we cannot create a DOM-like structure that has the same API. Duck
> typing, really...

I have something that works. 
http://lauploix.blogspot.com/2008/07/xpath-for-my-trees.html
It has the pro and cons of the ElementTree 1.3 XPath engine, but it
works quite nice.

Laurent Ploix
--
http://mail.python.org/mailman/listinfo/python-list


Re: How to request data from a lazily-created tree structure ?

2008-06-23 Thread méchoui
On Jun 17, 11:54 pm, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
> > Do you know if there is suchXPathengine that can be applied to a DOM-
> > like structure ?
>
> No. But I toyed with the idea to write one :)
>
> > One way would be to take anXPathengine from an existing XML engine
> > (ElementTree, or any other), and see what APIs it calls... and see if
> > we cannot create a DOM-like structure that has the same API. Duck
> > typing, really...
>
> Why can't you create a *real* DOM?
>
> Diez

I may have found sg: http://sourceforge.net/projects/pdis-xpath/

A XPath 1.0, in pure python, on top of ElementTree. I'll have a look.
--
http://mail.python.org/mailman/listinfo/python-list


Re: How to request data from a lazily-created tree structure ?

2008-06-20 Thread méchoui
On Jun 17, 10:54 pm, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
> > Do you know if there is such XPath engine that can be applied to a DOM-
> > like structure ?
>
> No. But I toyed with the idea to write one :)
>
> > One way would be to take an XPath engine from an existing XML engine
> > (ElementTree, or any other), and see what APIs it calls... and see if
> > we cannot create a DOM-like structure that has the same API. Duck
> > typing, really...
>
> Why can't you create a *real* DOM?
>
> Diez

I don't know what "real" means, in fact. In python, being a "real" sg
is all about having the same interface, right? May be I did not
undertand what you meant.

I cannot load all the data in memory before I request it, because it
would take too long. If using XPath-like tools requires that I load
the data in memory, I'd better create my own algorithm instead. It
will be much faster.

What I mean it: if I have a XPath engine that works well on a specific
DOM-like structure... may be I can create my own DOM-lile structure to
fool the XPath engine; so that I can use it on my own structure.
--
http://mail.python.org/mailman/listinfo/python-list


Re: How to request data from a lazily-created tree structure ?

2008-06-17 Thread Diez B. Roggisch



Do you know if there is such XPath engine that can be applied to a DOM-
like structure ?


No. But I toyed with the idea to write one :)


One way would be to take an XPath engine from an existing XML engine
(ElementTree, or any other), and see what APIs it calls... and see if
we cannot create a DOM-like structure that has the same API. Duck
typing, really...



Why can't you create a *real* DOM?

Diez
--
http://mail.python.org/mailman/listinfo/python-list


Re: How to request data from a lazily-created tree structure ?

2008-06-17 Thread méchoui
On Jun 17, 9:08 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
> > Yes, I need to make sure my requests are properly written so that the
> > generic XPath engine does not need all the structure in memory.
>
> > There are quite a few cases where you really don't need to load
> > everything at all. /a/b/*/c/d is an example. But even with an example
> > like /x/z[last()]/t, you don't need to load everything under the
> > every /x/z nodes. You just need to check for the latest one, and make
> > sure there is a t node under it.
>
> > Anyway, if I need to make requests that need all the data... that
> > means that the need for lazy instantiation of nodes disappears,
> > right ?
>
> Yes. And unless you have memory-constraints I have to admit that I
> really doubt that the parsing overhead isn't by far exceeded by the
> network latency.
>
> Diez

Do you know if there is such XPath engine that can be applied to a DOM-
like structure ?

One way would be to take an XPath engine from an existing XML engine
(ElementTree, or any other), and see what APIs it calls... and see if
we cannot create a DOM-like structure that has the same API. Duck
typing, really...
--
http://mail.python.org/mailman/listinfo/python-list


Re: How to request data from a lazily-created tree structure ?

2008-06-17 Thread Diez B. Roggisch


Yes, I need to make sure my requests are properly written so that the
generic XPath engine does not need all the structure in memory.

There are quite a few cases where you really don't need to load
everything at all. /a/b/*/c/d is an example. But even with an example
like /x/z[last()]/t, you don't need to load everything under the
every /x/z nodes. You just need to check for the latest one, and make
sure there is a t node under it.

Anyway, if I need to make requests that need all the data... that
means that the need for lazy instantiation of nodes disappears,
right ?



Yes. And unless you have memory-constraints I have to admit that I 
really doubt that the parsing overhead isn't by far exceeded by the 
network latency.


Diez
--
http://mail.python.org/mailman/listinfo/python-list


Re: How to request data from a lazily-created tree structure ?

2008-06-16 Thread méchoui
On Jun 16, 11:16 pm, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
> méchoui schrieb:
>
>
>
> > Problem:
>
> > - You have tree structure (XML-like) that you don't want to create
> > 100% in memory, because it just takes too long (for instance, you need
> > a http request to request the information from a slow distant site).
> > - But you want to be able to request data from it, such has "give me
> > all nodes that are under a "//foo/bar" tree, and have a child with an
> > "baz" attribute of value "zzz".
>
> > Question :
>
> > Do you have any other idea to request data from a lazily-created tree
> > structure ?
>
> > And does it make sense to create a DOM-like structure and to use a
> > generic XPath engine to request the tree ? (and does this generic
> > XPath engine exist ?)
>
> > The idea is to have the tree structure created on the fly (we are in
> > python), only when the XPath engine requests the data. Hopefully the
> > XPath engine will not request all the data from the tree (if the
> > request is smart enough and does not contain **, for instance).
>
> Generic XPath works only with a DOM(like) structure. How else would you
> e.g. evaluate an expression like foo[last()]?
>
> So if you really need lazy evaluation, you will need to specifically
> analyze the query of interest and see if it can be coded in a way that
> allows to forget as much of the tree as possible, or even better not
> query it.
>
> Diez

Yes, I need to make sure my requests are properly written so that the
generic XPath engine does not need all the structure in memory.

There are quite a few cases where you really don't need to load
everything at all. /a/b/*/c/d is an example. But even with an example
like /x/z[last()]/t, you don't need to load everything under the
every /x/z nodes. You just need to check for the latest one, and make
sure there is a t node under it.

Anyway, if I need to make requests that need all the data... that
means that the need for lazy instantiation of nodes disappears,
right ?
--
http://mail.python.org/mailman/listinfo/python-list


Re: How to request data from a lazily-created tree structure ?

2008-06-16 Thread Diez B. Roggisch

méchoui schrieb:

Problem:

- You have tree structure (XML-like) that you don't want to create
100% in memory, because it just takes too long (for instance, you need
a http request to request the information from a slow distant site).
- But you want to be able to request data from it, such has "give me
all nodes that are under a "//foo/bar" tree, and have a child with an
"baz" attribute of value "zzz".

Question :

Do you have any other idea to request data from a lazily-created tree
structure ?

And does it make sense to create a DOM-like structure and to use a
generic XPath engine to request the tree ? (and does this generic
XPath engine exist ?)

The idea is to have the tree structure created on the fly (we are in
python), only when the XPath engine requests the data. Hopefully the
XPath engine will not request all the data from the tree (if the
request is smart enough and does not contain **, for instance).


Generic XPath works only with a DOM(like) structure. How else would you 
e.g. evaluate an expression like foo[last()]?



So if you really need lazy evaluation, you will need to specifically 
analyze the query of interest and see if it can be coded in a way that 
allows to forget as much of the tree as possible, or even better not 
query it.


Diez
--
http://mail.python.org/mailman/listinfo/python-list


How to request data from a lazily-created tree structure ?

2008-06-16 Thread méchoui
Problem:

- You have tree structure (XML-like) that you don't want to create
100% in memory, because it just takes too long (for instance, you need
a http request to request the information from a slow distant site).
- But you want to be able to request data from it, such has "give me
all nodes that are under a "//foo/bar" tree, and have a child with an
"baz" attribute of value "zzz".

Question :

Do you have any other idea to request data from a lazily-created tree
structure ?

And does it make sense to create a DOM-like structure and to use a
generic XPath engine to request the tree ? (and does this generic
XPath engine exist ?)

The idea is to have the tree structure created on the fly (we are in
python), only when the XPath engine requests the data. Hopefully the
XPath engine will not request all the data from the tree (if the
request is smart enough and does not contain **, for instance).

Thanks
--
http://mail.python.org/mailman/listinfo/python-list