Re: [Tutor] XML node name and property

2006-03-31 Thread Kent Johnson
Keo Sophon wrote:
 Hi all,
 
 How can I get the name of an XML node, and its property name and
 property value?

How are you reading the XML? (xml.dom, ElementTree, BeautifulSoup...)

Kent

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


[Tutor] getAttribute

2006-03-31 Thread kakada
Hello all,

Could anyone help me by giving an example of getAttribute?

xmldata = ziparchive.read('content.xml')
xmldoc = minidom.parseString(xmldata)
officeNode = xmldoc.getElementsByTagName('office:text')

I have another function:
def go_thru(nodelist):
    for x in range(nodelist.length):
        node = nodelist[x]
        i = 0
        print node.getAttribute('text:style-name')

go_thru(officeNode) #Calling function

I always get error message: AttributeError: Text instance has no
attribute 'getAttribute'
Could anyone give an explanation?

Thanks a lot,

da



Re: [Tutor] XML node name and property

2006-03-31 Thread Keo Sophon
On Friday 31 March 2006 17:37, Kent Johnson wrote:
 Keo Sophon wrote:
  Hi all,
 
  How can I get the name of an XML node, and its property name and
  property value?

 How are you reading the XML? (xml.dom, ElementTree, BeautifulSoup...)

 Kent



I am using xml.dom.

Phon


[Tutor] how to get the content of an XML elements.

2006-03-31 Thread Keo Sophon
Hi all,

Is there any way to get the content of an XML element? I am using xml.dom.

Thanks,
Phon


Re: [Tutor] Multi-thread environments

2006-03-31 Thread Liam Clarke
Thanks very much for that Kent, works fine and dandy now. This is
one to chalk up to experience. I copied the dicts as you said.

Regards,

Liam


On 3/31/06, Kent Johnson [EMAIL PROTECTED] wrote:
 Liam Clarke wrote:
  Hi all,
 
  I'm working in my first multi-threaded environments, and I think I
  might have just been bitten by that.
 
  class Parser:
      def __init__(self, Q):
          self.Q = Q
          self.players = {}
          self.teams = {}
 
      def sendData(self):
          if not self.players or not self.teams: return
          self.Q.put((self.players, self.teams))
          self.resetStats()
 
      def resetStats(self):
          for key in self.players:
              self.players[key] = 0
          for key in self.teams:
              self.teams[key] = 0
 

  What I'm finding is that a lot more sets of zeroed data are being
  sent to the DAO than should occur.
 
  If the resetStats() call is commented out, data is sent correctly. I
  need to reset the variables after each send so as to not try and
  co-ordinate state with a database, otherwise I'd be away laughing.
 
  My speculation is that because the Queue is shared between two
  threads, one of which is looping on it, that a data write to the Queue
  may actually occur after the next method call, the resetStats()
  method, has occurred.
 
  So, the call to Queue.put() is made, but the actual data is accessed in
  memory by the Queue after resetStats has changed it.

 You're close. The call to Queue.put() is synchronous - it will finish
 before the call to resetStats() is made - but the *data* is still shared.

 What is in the Queue is references to the dicts that are also referenced
 by self.players and self.teams. The actual dicts are not copied! This is
 normal Python function call and assignment semantics, but in this case
 it's not what you want. You have a race condition - if the data in the
 Queue is processed before the call to resetStats() is made, it will work
 fine; if resetStats() is called first, it will be a problem. Actually
 there are many possible failures since resetStats() loops over the
 dicts, the consumer could be interleaving its reads with the writes in
 resetStats().

 What you need to do is copy the data, either before you put it in the
 queue or as part of the reset. I suggest rewriting resetStats() to
 create new dicts because dict.fromkeys() will do just what you want:
     def resetStats(self):
         self.players = dict.fromkeys(self.players.keys(), 0)
         self.teams = dict.fromkeys(self.teams.keys(), 0)

 This way you won't change the data seen by the consumer thread.
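
The difference between mutating a shared dict and rebinding the name to a new one can be seen without any threads at all. The following is a minimal sketch in current Python 3 syntax (the thread itself predates Python 3); the names `q` and `players` and the sample data are illustrative assumptions, not Liam's actual code.

```python
import queue

q = queue.Queue()
players = {"alice": 3, "bob": 5}

# Buggy pattern: the queue holds a reference to the very dict we then zero.
q.put(players)
for key in players:
    players[key] = 0
shared = q.get()          # the consumer sees the zeroed data

# Fixed pattern: rebind the name to a brand-new dict instead of mutating.
players = {"alice": 3, "bob": 5}
q.put(players)
players = dict.fromkeys(players, 0)
kept = q.get()            # the consumer still sees the original counts
```

In the fixed pattern the dict that went into the queue is never touched again by the producer, so it does not matter when the consumer gets around to reading it.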

  I've spent about eight hours so far trying to debug this; I've never
  been this frustrated in a Python project before to be honest... I've
  reached my next skill level bump, so to speak.

 Yes, threads can be mind-bending until you learn to spot the gotchas
 like this.

 By the way you also have a race condition here:
     if self.dump:
         self.parser.sendData()
         self.dump = False

 Possibly the thread that sets self.dump will set it again between the
 time you test it and when you reset it. If the setting thread is on a
 timer and the time is long enough, it won't be a problem, but it is a
 potential bug.
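
One way to close that window is to make the test and the reset a single atomic step. The sketch below is only one possible approach, written in Python 3; the class name `DumpFlag` and its methods are hypothetical and do not come from the original code.

```python
import threading

class DumpFlag:
    """A flag whose test-and-clear step is atomic."""

    def __init__(self):
        self._lock = threading.Lock()
        self._set = False

    def request(self):
        # Called by the timer thread to ask for a dump.
        with self._lock:
            self._set = True

    def take(self):
        """Return True once per request, clearing the flag under the lock."""
        with self._lock:
            was_set = self._set
            self._set = False
            return was_set

flag = DumpFlag()
flag.request()
first = flag.take()    # the request is consumed here
second = flag.take()   # nothing pending any more
```

The worker would call `take()` first and send the data only if it returned True. A request that arrives while the data is being sent then stays set and is picked up on the next pass instead of being overwritten.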

 Kent



Re: [Tutor] XML node name and property

2006-03-31 Thread Kent Johnson
Keo Sophon wrote:
 On Friday 31 March 2006 17:37, Kent Johnson wrote:
 
Keo Sophon wrote:

How can I get the name of an XML node, and its property name and
property value?

How are you reading the XML? (xml.dom, ElementTree, BeautifulSoup...)

 
 I am using xml.dom.

I generally use ElementTree for XML access, it is much simpler than 
xml.dom. But here is what I figured out:
The tag name is node.nodeName

If by property you mean attribute, you can get a list of all attributes 
by node.attributes.keys() and access the value of a particular one by
node.attributes['attributeName'].nodeValue

Google python xml.dom to find many more examples. Or Google ElementTree 
to find an easier way to do it...
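
As a small illustration of the calls mentioned above, here is a sketch using xml.dom.minidom in Python 3 syntax. The sample document and its names (`book`, `id`, `lang`, `title`) are made up for the example.

```python
from xml.dom import minidom

doc = minidom.parseString('<book id="42" lang="en"><title>Dune</title></book>')
node = doc.documentElement

name = node.nodeName                          # tag name of the element
attr_names = sorted(node.attributes.keys())   # names of all attributes
lang = node.attributes["lang"].nodeValue      # value of one attribute

# Text content lives in child Text nodes, not on the element itself.
title = node.getElementsByTagName("title")[0].firstChild.data
```

The last line also covers the related question of getting an element's content: for a simple element holding only text, the text is the `data` of its first child node.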

Kent



Re: [Tutor] getAttribute

2006-03-31 Thread Kent Johnson
kakada wrote:
 Hello all,
 
 Could anyone help me by giving example of getAttribute
 
 xmldata = ziparchive.read('content.xml')
 xmldoc = minidom.parseString(xmldata)
 officeNode = xmldoc.getElementsByTagName('office:text')
 
 I have another function:
 def go_thru(nodelist):
     for x in range(nodelist.length):
         node = nodelist[x]
         i = 0
         print node.getAttribute('text:style-name')
 
 go_thru(officeNode) #Calling function
 
 I always get error message: AttributeError: Text instance has no
 attribute 'getAttribute'
 Could anyone give an explanation?

I think it should be node.attributes['text:style-name'].nodeValue.

You might want to look at ElementTree, it is a more Pythonic XML library.
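
One likely cause of the AttributeError (an assumption, since the full script is not shown) is that the list being walked contains Text nodes, for example the whitespace between tags in a childNodes list. Only Element nodes have getAttribute(). The sketch below, in Python 3 syntax, skips everything that is not an element; the namespace URIs and the sample document are placeholders invented for the example.

```python
from xml.dom import minidom

xml = (
    '<office:text xmlns:office="urn:example:office" '
    'xmlns:text="urn:example:text">\n'
    '  <text:p text:style-name="Standard">Hello</text:p>\n'
    '  <text:p text:style-name="Heading">World</text:p>\n'
    '</office:text>'
)
doc = minidom.parseString(xml)

def style_names(nodelist):
    """Return the text:style-name of every element node in nodelist."""
    names = []
    for node in nodelist:
        # Whitespace between tags shows up as Text nodes, which have no
        # getAttribute(); only Element nodes do.
        if node.nodeType == node.ELEMENT_NODE:
            names.append(node.getAttribute("text:style-name"))
    return names

office = doc.getElementsByTagName("office:text")[0]
styles = style_names(office.childNodes)
```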

Kent



[Tutor] Looking for Constructs to Remove Redundant Code

2006-03-31 Thread Ilias Lazaridis
I have this python code

class Car:
    """Type of car."""

    manufacturer = f.string()
    model = f.string()
    modelYear = f.integer()

    _key(manufacturer, model, modelYear)

    def __str__(self):
        return '%s %s %s' % (self.modelYear, self.manufacturer, self.model)

-

and would like to see it e.g. this way:

class Car:
    """Type of car."""

    manufacturer = f.string(True, str=2)
    model = f.string(True, str=3)
    modelYear = f.integer(True, str=1)

-

What would the factory method look like?

def string(self, key, str):
    # create somehow the __str__ function
    # create somehow the key

.

-- 
http://lazaridis.com



Re: [Tutor] Looking for Constructs to Remove Redundant Code

2006-03-31 Thread Kent Johnson
Ilias Lazaridis wrote:
 I have this python code
 
 class Car:
     """Type of car."""
 
     manufacturer = f.string()
     model = f.string()
     modelYear = f.integer()
 
     _key(manufacturer, model, modelYear)
 
     def __str__(self):
         return '%s %s %s' % (self.modelYear, self.manufacturer, self.model)

What is f.string()? What is _key()? Are you using a metaclass here? Did 
you intentionally omit an __init__() method? If this is working code 
there is a lot you are not showing.

 and would like to see it e.g. this way:
 
 class Car:
     """Type of car."""
 
     manufacturer = f.string(True, str=2)
     model = f.string(True, str=3)
     modelYear = f.integer(True, str=1)
 
 -
 
 What would the factory method look like?
 
 def string(self, key, str):
     # create somehow the __str__ function
     # create somehow the key

This would go in your metaclass __init__ I think. But hard to say 
without more details.
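
To give a rough idea of what "in the metaclass __init__" could mean, here is a minimal sketch in Python 3 syntax. It is not Ilias's framework: `Field`, `ModelMeta` and the `str_pos` parameter are hypothetical names invented for the illustration, and key handling is left out entirely.

```python
class Field:
    """Hypothetical field marker; str_pos is its position in __str__."""

    def __init__(self, str_pos=None):
        self.str_pos = str_pos
        self.name = None

class ModelMeta(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Collect the fields that asked for a place in __str__.
        fields = []
        for attr, value in namespace.items():
            if isinstance(value, Field):
                value.name = attr
                if value.str_pos is not None:
                    fields.append(value)
        fields.sort(key=lambda f: f.str_pos)
        names = [f.name for f in fields]
        if names:
            # Build __str__ from the declared positions.
            def __str__(self):
                return " ".join(str(getattr(self, n)) for n in names)
            cls.__str__ = __str__

class Car(metaclass=ModelMeta):
    manufacturer = Field(str_pos=2)
    model = Field(str_pos=3)
    modelYear = Field(str_pos=1)

    def __init__(self, manufacturer, model, modelYear):
        self.manufacturer = manufacturer
        self.model = model
        self.modelYear = modelYear

car_text = str(Car("Saab", "900", 1989))
```

The factory call only records the wish ("put me in position 2"); the metaclass sees all the fields of the class at once and is therefore the place that can assemble a single `__str__` from them.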

Kent



[Tutor] BeautifulSoup - getting cells without new line characters

2006-03-31 Thread jonasmg
From a table, I want to get the cells and then choose only some of them.

<table>
<tr>
<td>WY</td>
<td>Wyo.</td>
</tr>
...
</table>

Using:

for row in table('tr'): print row.contents

   ['\n', <td>WY</td>, '\n', <td>Wyo.</td>, '\n']
   [...]

I get a newline character between each cell.

Is it possible to get them without those '\n'?

Thanks in advance! 


Re: [Tutor] BeautifulSoup - getting cells without new line characters

2006-03-31 Thread Kent Johnson
[EMAIL PROTECTED] wrote:
 From a table, I want to get the cells and then choose only some of them.
 
 <table>
 <tr>
 <td>WY</td>
 <td>Wyo.</td>
 </tr>
 ...
 </table>
 
 Using:
 
 for row in table('tr'): print row.contents
 
    ['\n', <td>WY</td>, '\n', <td>Wyo.</td>, '\n']
    [...]
 
 I get a newline character between each cell.
 
 Is it possible to get them without those '\n'?

Well, the newlines are in your data, so you need to strip them or ignore 
them somewhere.

You don't say what you are actually trying to do, maybe this is close:
   for row in table('tr'):
       cellText = [cell.string for cell in row('td')]
       print ' '.join(cellText)

Kent



Re: [Tutor] BeautifulSoup - getting cells without new line characters

2006-03-31 Thread jonasmg
Kent Johnson writes: 

 [EMAIL PROTECTED] wrote:
 From a table, I want to get the cells and then choose only some of them.
 
 <table>
 <tr>
 <td>WY</td>
 <td>Wyo.</td>
 </tr>
 ...
 </table>
 
 Using:
 
 for row in table('tr'): print row.contents
 
    ['\n', <td>WY</td>, '\n', <td>Wyo.</td>, '\n']
    [...]
 
 I get a newline character between each cell.
 
 Is it possible to get them without those '\n'?
 
 Well, the newlines are in your data, so you need to strip them or ignore
 them somewhere.
 
 You don't say what you are actually trying to do, maybe this is close:
    for row in table('tr'):
        cellText = [cell.string for cell in row('td')]
        print ' '.join(cellText)
 
 Kent 
 

For each row, I only want to get some positions (i.e.
row.contents[0], row.contents[2]).


Re: [Tutor] BeautifulSoup - getting cells without new line characters

2006-03-31 Thread jonasmg
Kent Johnson writes: 

 [EMAIL PROTECTED] wrote:
 You are right, but the problem is that some cells have anchors.
 Sorry, I forgot to mention it.
 
 and using:
 
 for row in table('tr'):
     cellText = [cell.string for cell in row('td')]
     print cellText
 
 I get null values in cells with anchors.
 
 Can you give an example of your actual data and the result you want to 
 generate from it? I can't give you a correct answer if you don't tell me 
 the real question. 
 
 Kent 
 

List of states:
http://en.wikipedia.org/wiki/U.S._state 

: soup = BeautifulSoup(html)
: # Get the second table (list of states).
: table = soup.first('table').findNext('table')
: print table 

...
<tr>
<td>WY</td>
<td>Wyo.</td>
<td><a href="/wiki/Wyoming" title="Wyoming">Wyoming</a></td>
<td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne, Wyoming">Cheyenne</a></td>
<td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne, Wyoming">Cheyenne</a></td>
<td><a href="/wiki/Image:Flag_of_Wyoming.svg" class="image" title=""><img
src="http://upload.wikimedia.org/wikipedia/commons/thumb/b/bc/Flag_of_Wyoming.svg/45px-Flag_of_Wyoming.svg.png"
width="45" alt="" height="30" longdesc="/wiki/Image:Flag_of_Wyoming.svg" /></a></td>
</tr>
</table>

From each row (tr), I want to get cells (td) 1, 3 and 4
(postal, state, capital). But cells 3 and 4 have anchors.

Thanks Kent. 


Re: [Tutor] BeautifulSoup - getting cells without new line characters

2006-03-31 Thread Kent Johnson
[EMAIL PROTECTED] wrote:

 List of states:
 http://en.wikipedia.org/wiki/U.S._state 
 
 : soup = BeautifulSoup(html)
 : # Get the second table (list of states).
 : table = soup.first('table').findNext('table')
 : print table 
 
 ...
 <tr>
 <td>WY</td>
 <td>Wyo.</td>
 <td><a href="/wiki/Wyoming" title="Wyoming">Wyoming</a></td>
 <td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne, Wyoming">Cheyenne</a></td>
 <td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne, Wyoming">Cheyenne</a></td>
 <td><a href="/wiki/Image:Flag_of_Wyoming.svg" class="image" title=""><img
 src="http://upload.wikimedia.org/wikipedia/commons/thumb/b/bc/Flag_of_Wyoming.svg/45px-Flag_of_Wyoming.svg.png"
 width="45" alt="" height="30" longdesc="/wiki/Image:Flag_of_Wyoming.svg" /></a></td>
 </tr>
 </table>
 
 From each row (tr), I want to get cells (td) 1, 3 and 4
 (postal, state, capital). But cells 3 and 4 have anchors.

So dig into the cells and get the data from the anchor.

cells = row('td')
cells[0].string
cells[2]('a')[0].string   # row('a') returns a list, so take its first item
cells[3]('a')[0].string
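
For readers without BeautifulSoup at hand, the same idea (collect the text of each cell, including text nested inside an anchor, then pick cells by position) can be sketched with the standard library's html.parser module in Python 3. This swaps in a different technique from the one used in the thread; the class name `CellCollector` and the trimmed sample HTML are assumptions made for the example.

```python
from html.parser import HTMLParser

class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Text inside <a> still arrives here, so anchors need no special case.
        if self._cell is not None:
            self._cell.append(data)

html = """<table>
<tr><th>Postal</th><th>Abbr.</th><th>State</th><th>Capital</th></tr>
<tr>
<td>WY</td>
<td>Wyo.</td>
<td><a href="/wiki/Wyoming">Wyoming</a></td>
<td><a href="/wiki/Cheyenne%2C_Wyoming">Cheyenne</a></td>
</tr>
</table>"""

parser = CellCollector()
parser.feed(html)
# Skip rows with too few <td> cells (for example a header row of <th>).
picked = [(r[0], r[2], r[3]) for r in parser.rows if len(r) >= 4]
```

The newlines between cells never reach the result because text is recorded only while a cell is open, and the length check avoids indexing into rows that have no `td` cells.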

Kent



Re: [Tutor] Apple Remote Mouse

2006-03-31 Thread w chun
 ... this seems to me to be the kind of query where
 you could legitimately post to the main Python
 newsgroup / mailing list and/or to some Mac-specific
 one, if there is such a thing.

... and there is:

http://mail.python.org/mailman/listinfo/pythonmac-sig

cheers,
-- wesley
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Core Python Programming, Prentice Hall, (c)2007,2001
http://corepython.com

wesley.j.chun :: wescpy-at-gmail.com
cyberweb.consulting : silicon valley, ca
http://cyberwebconsulting.com


Re: [Tutor] BeautifulSoup - getting cells without new line characters

2006-03-31 Thread jonasmg
Kent Johnson writes: 

 [EMAIL PROTECTED] wrote: 
 
 List of states:
 http://en.wikipedia.org/wiki/U.S._state  
 
 : soup = BeautifulSoup(html)
 : # Get the second table (list of states).
 : table = soup.first('table').findNext('table')
 : print table  
 
 ...
 <tr>
 <td>WY</td>
 <td>Wyo.</td>
 <td><a href="/wiki/Wyoming" title="Wyoming">Wyoming</a></td>
 <td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne, Wyoming">Cheyenne</a></td>
 <td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne, Wyoming">Cheyenne</a></td>
 <td><a href="/wiki/Image:Flag_of_Wyoming.svg" class="image" title=""><img
 src="http://upload.wikimedia.org/wikipedia/commons/thumb/b/bc/Flag_of_Wyoming.svg/45px-Flag_of_Wyoming.svg.png"
 width="45" alt="" height="30" longdesc="/wiki/Image:Flag_of_Wyoming.svg" /></a></td>
 </tr>
 </table>
 
 From each row (tr), I want to get cells (td) 1, 3 and 4
 (postal, state, capital). But cells 3 and 4 have anchors.
 
 So dig into the cells and get the data from the anchor. 
 
 cells = row('td')
 cells[0].string
 cells[2]('a')[0].string
 cells[3]('a')[0].string
 
 Kent 
 

for row in table('tr'):
   cells = row('td')
   print cells[0] 

IndexError: list index out of range 


Re: [Tutor] Inverted Index Algorithm

2006-03-31 Thread Kent Johnson
Steve Nelson wrote:
 Hello All,
 
 I've been reading about Inverted Indexing - I'd like to try to write
 something in Python that illustrates the concept, as I've got to give
 a presentation about it.
 
 Where would be a good place to start?

Do you need help getting started with Python or with inverted indexing 
in particular?

Kent



Re: [Tutor] removing file from zip archive.

2006-03-31 Thread w chun
 How can we remove one file inside of a zip archive?

 import zipfile
 ziparchive = zipfile.ZipFile('test.odt', 'r')
 xmldata = ziparchive.read('content.xml')
 ziparchive.close   <--- ADD ( ) HERE TOO


Sophon,

You can remove any number of files from a ZIP file, but it has to be
processed manually by you.  When you read() a file from a ZIP archive,
you actually have all the data with you, i.e. xmldata.

All you have to do is to open another file to write it out to disk, i.e.,

f = open('content.xml', 'w')
f.write(xmldata)
f.close()
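
If the goal is an archive that no longer contains a given member, one common approach is to copy every other member into a new archive, since the zipfile module has historically offered no call to delete a member in place. The following is a minimal sketch in Python 3; the helper name `copy_without` and the in-memory demo archive are assumptions made for the example.

```python
import io
import zipfile

def copy_without(src, dst, unwanted):
    """Copy every member of the archive src into dst except `unwanted`."""
    with zipfile.ZipFile(src, "r") as zin, zipfile.ZipFile(dst, "w") as zout:
        for item in zin.infolist():
            if item.filename != unwanted:
                # Passing the ZipInfo keeps the member's name and metadata.
                zout.writestr(item, zin.read(item.filename))

# Build a small in-memory archive to demonstrate.
original = io.BytesIO()
with zipfile.ZipFile(original, "w") as z:
    z.writestr("content.xml", "<doc/>")
    z.writestr("styles.xml", "<styles/>")

trimmed = io.BytesIO()
copy_without(original, trimmed, "content.xml")
with zipfile.ZipFile(trimmed) as z:
    remaining = z.namelist()
```

With real files, `src` and `dst` would be two different paths, and the new file would replace the old one once the copy has finished.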

hope this helps!
-- wesley
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Core Python Programming, Prentice Hall, (c)2007,2001
http://corepython.com

wesley.j.chun :: wescpy-at-gmail.com
cyberweb.consulting : silicon valley, ca
http://cyberwebconsulting.com


Re: [Tutor] Looking for Constructs to Remove Redundant Code

2006-03-31 Thread Ilias Lazaridis
Kent Johnson wrote:
...

Thank you for your comments. I realize that my request was not very
clear. Here is a second, more simplified attempt:

I have this python code:

class Car(BaseClass):
    manufacturer = stringFactory()
    model = stringFactory()
    modelYear = integerFactory()

    def __str__(self):
        return '%s %s %s' % (self.modelYear, self.manufacturer,
                             self.model)

def stringFactory(self):
    s = String()  # creates a string object
    #...          # does several things
    return s      # returns the string object

-

and would like to see it e.g. this way:

class Car(BaseClass):
    manufacturer = stringFactory(2)
    model = stringFactory(3)
    modelYear = integerFactory(1)

def stringFactory(self, position):
    s = String()  # creates a string object
    ...           # does several things
                  # creates somehow the __str__ functionality...

    return s      # returns the string object

-

I hope this is clearer now.

.

-- 
http://lazaridis.com



Re: [Tutor] Python + PostGreSQL

2006-03-31 Thread Bill Campbell
On Fri, Mar 31, 2006, Srinivas Iyyer wrote:
Dear group, 

I want to connect python to postgresql. 
My python dist. is 2.4.2
My postgres: 8.1.2
My system: Linux Enterprise Linux, Intel Xeon, 4GB
RAM.

I tried to install pygresql: version: 3.8, it failed
throwing exception : Exception: pg_config tool is not
available.

I gave another try on google and Postgres site and
found Pypgsql, PoPy and psycopg1. 

I think that psycopg is generally considered the preferred
package.  I have been using it with several systems including
Zope, and sqlobject.  So far I haven't tried psycopg2.

Bill
--
INTERNET:   [EMAIL PROTECTED]  Bill Campbell; Celestial Software LLC
URL: http://www.celestial.com/  PO Box 820; 6641 E. Mercer Way
FAX:(206) 232-9186  Mercer Island, WA 98040-0820; (206) 236-1676

A child can go only so far in life without potty training.  It is not
mere coincidence that six of the last seven presidents were potty
trained, not to mention nearly half of the nation's state legislators.
-- Dave Barry


Re: [Tutor] request for sugestions on fragement of code for generating truth-tables

2006-03-31 Thread Danny Yoo


 Then, the output is like so:

   atoms = ["a", "b", "c"]
   tvas = tva_dict_maker(atoms)
   display_tvas(tvas)
 a:True    b:True    c:True
 a:True    b:True    c:False
 a:True    b:False   c:True
 a:True    b:False   c:False
 a:False   b:True    c:True
 a:False   b:True    c:False
 a:False   b:False   c:True
 a:False   b:False   c:False

Hi Brian,

We might be able to take advantage of the recursive nature of this
problem.

I'll sketch out the idea and try to fight the temptation to write it out
in full.  *grin* If you haven't encountered recursion before, please shout
out and ask for more details.


When we look above, we're looking at the solution for tva_dict_maker-ing
the list ['a', 'b', 'c'].

But let's imagine what tva_dict_maker() looks like for a slightly smaller
problem, for ['b', 'c']:

  b:True    c:True
  b:True    c:False
  b:False   c:True
  b:False   c:False


If we look at this and use our pattern-matching abilities, we might say
that the solution for ['b', 'c'] really looks like half of the solution
for ['a', 'b', 'c'].  That is, to get:

tva_dict_maker(['a', 'b', 'c'])

all we really need are the results of tva_dict_maker(['b', 'c']).  We can
then twiddle two copies of the smaller solution to make 'a' either True or
False, combine them together, and we've got it.


Recursive approaches have two parts to them, a base case and an
inductive case:

1.  Figure out solutions for really simple examples.  For example, in
the problem above, we know that something like:

tva_dict_maker(['c'])

has a very simple solution, since we're only dealing with a list
of one element.  ([{'c': True}, {'c' : False}])

2.  And for larger problems, let's figure out a way to break the
problem into something smaller.  We might be able to take the
solution of the smaller problem and make it scale up.


So a recursive approach will typically fit some template like this:

## Pseudocode #
def recursive_problem(some_problem):
    if some_problem is really obviously simple:
        return the obviously simple answer to some_problem
    otherwise:
        smaller_problem = some way of making some_problem slightly
                          smaller.  On lists, usually attack list[1:].
        small_solution = recursive_problem(smaller_problem)
        full_solution = twiddle small_solution somehow to make it solve
                        some_problem
        return full_solution
###
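
For anyone who wants to compare notes after trying it themselves, here is one way the template could be filled in for the truth-table problem, in Python 3 syntax. It is a sketch, not necessarily the intended solution: it uses the empty list as the base case rather than the one-element list discussed above, and it assumes tva_dict_maker returns a list of dicts.

```python
def tva_dict_maker(atoms):
    """Return every True/False assignment for atoms as a list of dicts."""
    if not atoms:
        # Base case: no atoms means exactly one (empty) assignment.
        return [{}]
    first, rest = atoms[0], atoms[1:]
    smaller = tva_dict_maker(rest)       # solve the smaller problem
    result = []
    for value in (True, False):          # two copies of the smaller solution
        for assignment in smaller:
            combined = {first: value}
            combined.update(assignment)
            result.append(combined)
    return result

tvas = tva_dict_maker(["a", "b", "c"])
```

Each extra atom doubles the number of rows, so three atoms give eight assignments, in the same order as the table at the top of this message.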


If you have more questions, please feel free to ask!



Re: [Tutor] Python + PostGreSQL

2006-03-31 Thread w chun
hi srini,

i don't know what your system configuration was like for the
installation, but from what i saw in your post (below), it just seems
like /usr/local/pgsql/bin was not in your path since it looks like
sh could not find the pg_config command, not Python (which choked
afterwards).
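
A quick way to check this kind of thing from Python itself is to ask whether a command can be found on the search path. The sketch below uses Python 3's shutil.which; the helper name `find_tool` is made up for the example, and the directory is the one quoted from the original post.

```python
import os
import shutil

def find_tool(name, extra_dir=None):
    """Return the full path of `name` if it can be found, else None."""
    path = os.environ.get("PATH", "")
    if extra_dir:
        # Search the extra directory as well, without changing os.environ.
        path = path + os.pathsep + extra_dir
    return shutil.which(name, path=path)

location = find_tool("pg_config", extra_dir="/usr/local/pgsql/bin")
```

If `find_tool("pg_config")` returns None without the extra directory but finds the tool with it, adding that directory to PATH before running the installer is the likely fix.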

anyway, if you got psycopg working for you, then just leave it. :-)

cheers,
-wesley

 sh: pg_config: command not found
:
 My pg_config is in : /usr/local/pgsql/bin/pg_config


Re: [Tutor] Inverted Index Algorithm

2006-03-31 Thread Danny Yoo


 The next step would be to introduce an index.  I think again, the
 simplest thing that could possibly work would be a literal index of
 every word and every document in which it appears.  This would save
 processing time, but wouldn't be very intelligent.

Yes, that's right, that's the idea of an inverted index.



 This is where I think the inverted indexing comes in.  As I understand
 it we can now produce an index of key words, with document name and
 location in document for each key word. This makes the search more
 involved, and more intelligent.  Finally we could have some logic that
 did some set analysis to return only results that make sense.

Location information would help allow you to do things like phrase or
proximity matching.  Another thing that might help is term frequency (tf).
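
For a presentation, a toy version of these ideas fits in a few lines of Python 3. The sketch below is an illustration only: the function names `build_index` and `search_all` and the two sample documents are invented, and splitting on whitespace stands in for real tokenizing.

```python
from collections import defaultdict

def build_index(docs):
    """Map each word to {document name: [positions where the word occurs]}."""
    index = defaultdict(dict)
    for name, text in docs.items():
        for position, word in enumerate(text.lower().split()):
            index[word].setdefault(name, []).append(position)
    return index

def search_all(index, words):
    """Return the names of documents containing every word (set intersection)."""
    postings = [set(index.get(w, {})) for w in words]
    return set.intersection(*postings) if postings else set()

docs = {
    "a.txt": "the quick brown fox",
    "b.txt": "the lazy dog and the quick cat",
}
index = build_index(docs)
both = search_all(index, ["quick", "the"])
tf_the_b = len(index["the"]["b.txt"])   # term frequency of "the" in b.txt
```

The stored positions are what make phrase and proximity matching possible, and the length of each position list is the term frequency for that word in that document.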


You might want to check out documentation about Lucene:

http://lucene.apache.org/java/docs/index.html

as they're the premier open source search library.  They have a
presentation that gives a good overview of the techniques used in a fast
search engine:

http://lucene.sourceforge.net/talks/inktomi/


If you want to reuse their engine, the OSAF folks have even written Python
bindings to the library:

http://pylucene.osafoundation.org/

