hidden built-in module

2008-03-07 Thread koara
Hello, is there a way to access a module that is hidden because
another module (of the same name) is found first?

More specifically, I have my own logging.py module, and inside this
module, depending on how initialization goes, I may want to do 'from
logging import *' from the built-in logging.

I hope my description was clear, cheers.

I am using Python 2.4.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: hidden built-in module

2008-03-07 Thread koara
On Mar 5, 1:39 pm, gigs [EMAIL PROTECTED] wrote:
 koara wrote:
  Hello, is there a way to access a module that is hidden because
  another module (of the same name) is found first?

  More specifically, i have my own logging.py module, and inside this
  module, depending on how initialization goes,  i may want to do 'from
  logging import *' from the built-in logging.

  I hope my description was clear, cheers.

  I am using python2.4.

 you can add your own logging module in an extra directory that has an
 __init__.py and import it like: from extradirectory.logging import *

 and builtin: from logging import *


Thank you for your reply, gigs. However, the point of this namespace
harakiri is that existing code which uses 'import logging' ...
'logging.info()' ... etc. continues working without any change.
Renaming my logging.py file is not an option -- if it were, I wouldn't
bother naming my module the same as a built-in :-)

Cheers.


Re: hidden built-in module

2008-03-07 Thread koara
 You can only try and search the sys-path for the logging-module, using

 sys.prefix

 and then look for logging.py. Using

 __import__(path)

 you get a reference to that module.

 Diez


Thank you Diez, that's the info I'd been looking for :-)

So the answer is the sys module + __import__.

Cheers!
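
A minimal sketch of that idea for current Python 3 (the thread itself targets Python 2.4, where importlib is not available). It assumes the standard library lives in an ordinary directory reported by sysconfig. load_stdlib_module is a hypothetical helper name, not part of any library. Since __import__ takes a module name rather than a file path, the sketch instead asks the import machinery to search only the standard library directory:

```python
import importlib.machinery
import importlib.util
import sysconfig


def load_stdlib_module(name):
    """Load `name` from the standard library directory only, ignoring sys.path.

    Hypothetical helper: the returned module object is NOT stored in
    sys.modules, so a local module of the same name stays importable.
    """
    stdlib_dir = sysconfig.get_paths()["stdlib"]
    # Search only the stdlib directory, so a shadowing logging.py is never seen.
    spec = importlib.machinery.PathFinder.find_spec(name, [stdlib_dir])
    if spec is None:
        raise ImportError("no module named %r in %s" % (name, stdlib_dir))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


stdlib_logging = load_stdlib_module("logging")
print(stdlib_logging.__file__)
```

Inside a shadowing logging.py, the public names of the loaded module could then be copied into the local namespace as needed, so that existing 'import logging' callers keep working.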


mmap disk performance

2007-11-20 Thread koara
Hello all,

I am using the mmap module (Python 2.4) to access the contents of a file.

My question regards the relative performance of mmap.seek() vs
mmap.tell(). I have a generator that returns stuff from the file,
piece by piece. Since other things may happen to the mmap object in
between consecutive next() calls (such as another iterator's next()),
I have to store the file position before yield and restore it
afterwards by means of tell() and seek(). Is this correct?

When restoring, is there a penalty for mmap.seek(pos) where the file
position is already at pos (i.e., nothing happened to the file
position in between, a common scenario)? If there is, is it worth
doing

    if mmap.tell() != pos:
        mmap.seek(pos)

or such?

Cheers!
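
A small sketch of the save/restore pattern described above, written for current Python 3. The chunks generator and the sample data are made up for illustration. Each generator keeps its own offset and always seeks before reading, so two generators can share one mmap object. I have not measured the cost of a redundant seek(); slicing (mm[pos:pos + size]) would avoid the shared file position altogether.

```python
import mmap
import tempfile


def chunks(mm, size):
    """Yield consecutive `size`-byte pieces, keeping a private offset."""
    pos = 0
    while pos < len(mm):
        mm.seek(pos)          # restore our position; others may have moved it
        data = mm.read(size)
        pos = mm.tell()       # remember where we stopped before yielding
        yield data


with tempfile.TemporaryFile() as f:
    f.write(b"abcdefghij")
    f.flush()
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    a = chunks(mm, 2)
    b = chunks(mm, 3)
    # Interleave the two generators on the same mmap object.
    interleaved = [next(a), next(b), next(a), next(b)]
    mm.close()

print(interleaved)
```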


urllib.unquote + unicode

2007-11-13 Thread koara
Hello all,

I am using urllib.unquote_plus to unquote a string. Sometimes I get a
strange string to unquote, for example spolu%u017E%E1ci.cz. Here
the problem is that some application decided to quote a non-ASCII
character as %u directly, instead of using an encoding and quoting
byte by byte.

Python (2.4.1) simply returns 'spolu%u017E\xe1ci.cz', which is likely
not what the application meant.

My question is: is this %u quoting a standard (i.e., urllib is in the
wrong), or is it not (i.e., the application is in the wrong and urllib
silently ignores the '%u0' - why?), and most importantly, is there a
simple workaround to get it working as expected?

Cheers!
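
One possible workaround, sketched for current Python 3 (urllib.parse instead of the Python 2 urllib module). It assumes the input follows the convention of JavaScript's escape(): %uXXXX for code units above 0xFF and plain %XX, read as Latin-1, for the rest. That assumption fits the example above but is not confirmed by the post. unquote_percent_u is a hypothetical helper name, and surrogate pairs are not combined.

```python
import re
from urllib.parse import unquote_plus

_PERCENT_U = re.compile(r"%u([0-9a-fA-F]{4})")


def unquote_percent_u(text):
    """Decode non-standard %uXXXX escapes, then ordinary %XX escapes."""
    # Step 1: turn every %uXXXX into the character with that code point.
    text = _PERCENT_U.sub(lambda m: chr(int(m.group(1), 16)), text)
    # Step 2: decode the remaining %XX escapes (and '+') as Latin-1 bytes.
    return unquote_plus(text, encoding="latin-1")


print(unquote_percent_u("spolu%u017E%E1ci.cz"))
```

The %u escapes are handled first so that a literal percent sign quoted as %25 is not mistaken for the start of a %u sequence after unquoting.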



Re: enumerate overflow

2007-10-03 Thread koara
On Oct 3, 7:22 pm, Raymond Hettinger [EMAIL PROTECTED] wrote:
 In Py2.6, I will most likely put in an automatic promotion to long
 for both enumerate() and count().  It took a while to figure out how
 to do this without killing the performance for normal cases (ones used
 in real programs, not examples contrived to say, omg, see what
 *could* happen).

 Raymond


Thanks everybody for the replies and suggestions. I'm glad to see the
issue's already been discovered/discussed/almost resolved.

By the way, I do not consider my programs in any way 'unreal'.



unicode categories -- regex

2007-09-22 Thread koara
Hello all -- my question regards special meta characters for the re
module. I saw in the re module documentation that '\w' can be used to
match any alphanumeric Unicode character. However,
there was no info on constructing patterns for other Unicode
categories, such as purely alphabetical characters, punctuation
symbols, etc.

I found that this category information actually IS available in Python
-- in the standard module unicodedata. For example,
unicodedata.category(u'.') gives 'Po' for 'Punctuation, other'.

So how do I use this information in a regular expression search? Any
ideas? Thanks.

I'm talking about Python 2.5 here.



Re: unicode categories -- regex

2007-09-22 Thread koara
 At the moment, you have to generate a character class for this yourself,
 e.g.
 ...


Thank you Martin, this is exactly what I wanted to know.
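
Martin's own example is elided above, so the following is only a sketch of the general approach for current Python 3, not his code. category_class is a hypothetical helper that scans all code points with unicodedata.category and emits a character class for the re module. The third-party regex package offers \p{...} properties directly, if an extra dependency is acceptable.

```python
import re
import sys
import unicodedata


def category_class(prefix):
    """Build a regex character class for all code points whose Unicode
    general category starts with `prefix` (e.g. 'P' or 'Lu')."""
    ranges = []
    start = prev = None
    for cp in range(sys.maxunicode + 1):
        if unicodedata.category(chr(cp)).startswith(prefix):
            if start is None:
                start = cp
            prev = cp
        elif start is not None:
            ranges.append((start, prev))   # close the current run
            start = None
    if start is not None:
        ranges.append((start, prev))
    parts = []
    for lo, hi in ranges:
        if lo == hi:
            parts.append(re.escape(chr(lo)))
        else:
            parts.append(re.escape(chr(lo)) + "-" + re.escape(chr(hi)))
    return "[" + "".join(parts) + "]"


punct = re.compile(category_class("P") + "+")
print(punct.findall("Hello, world... ok?"))
```

Building the class scans every code point once, so it is worth compiling the pattern a single time and reusing it.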



Re: memory efficient set/dictionary

2007-06-11 Thread koara
  I would recommend that you use a database, since it meets your
  requirements (off-memory, fast, persistent). The bsddb module
  (Berkeley DB) even gives you a dictionary-like interface.
  http://www.python.org/doc/lib/module-bsddb.html

 Standard SQL databases can work for this, but generally your
 recommendation of using bsddb works very well for int -> int mappings.
 In particular, I would suggest using a btree, if only because I have had
 troubles in the past with colliding keys in the bsddb.hash (and recno is
 just a flat file, and will attempt to create a file of size
 i*(record size) to write to record number i).

 As an alternative, there are many methods known from search engines for
 mapping int -> [int, int, ...], which can be implemented as int -> int,
 where the second int is a pointer to an address on disk. Looking into a
 few of the open source search implementations may be worthwhile.

Thanks guys! I will look into bsddb; hopefully it doesn't keep all
keys in memory. I couldn't find an answer to that during my (very brief)
look into the documentation.

And how about the extra memory used for set/dict'ing of integers? Is
there a simple answer?
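
One rough way to estimate that overhead yourself, sketched for current Python 3 rather than 2.5. The numbers depend on the interpreter version and platform, so none are quoted here. The sample keys are made up. The comparison with array('q') is only there to show what packed 64-bit storage costs.

```python
import sys
from array import array

n = 100_000
keys = [2**40 + i for i in range(n)]   # sample values needing more than 32 bits

as_set = set(keys)
as_array = array("q", keys)            # packed signed 64-bit integers

# getsizeof() on a set counts only its hash table, not the int objects
# it refers to, so those are added separately.
set_bytes = sys.getsizeof(as_set) + sum(sys.getsizeof(k) for k in as_set)
array_bytes = sys.getsizeof(as_array)

print("set:   %.1f bytes per key" % (set_bytes / n))
print("array: %.1f bytes per key" % (array_bytes / n))
```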



memory efficient set/dictionary

2007-06-10 Thread koara
What is the best way to go about using a large set (or dictionary) that
doesn't fit into main memory? What is Python's (2.5, let's say)
overhead for storing an int in a set, and how much for storing an
int -> int mapping in a dict?

Please recommend a module that allows persistent set/dict storage +
fast queries, that best fits my problem, and that is as lightweight as
possible. For queries, the hit ratio is about 10%. Fast updates would be
nice, but I can rewrite the algorithm so that the data is static, so
update speed is not critical.

Or am I better off not using Python here? Cheers.



Re: memory efficient set/dictionary

2007-06-10 Thread koara
Hello Steven,

On Jun 10, 5:29 pm, Steven D'Aprano
[EMAIL PROTECTED] wrote:
  ...
 How do you know it won't fit in main memory if you don't know the
 overhead? A guess? You've tried it and your computer crashed?

exactly

  Please recommend a module that allows persistent set/dict storage +
  fast query that best fits my problem,

 Usually I love guessing what people's problems are before making a
 recommendation, but I'm feeling whimsical so I think I'll ask first.

 What is the problem you are trying to solve? How many keys do you have?

Corpus processing. There are on the order of billions to tens of
billions of keys (64-bit integers).

 Can you group them in some way, e.g. alphabetically? Do you need to search
 on random keys, or can you queue them and do them in the order of your
 choice?


Yes, keys in sets and dictionaries can be grouped in any way; order
doesn't matter. I'm not sure what you mean.
Yes, I need fast random access (at least I do unless I rethink and
rewrite everything, which is what I'd like to avoid with the help of
this thread :-)

Thanks for the reply!
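
For a static set of 64-bit keys, one dependency-free option is a sorted binary file that is memory-mapped and searched by bisection. This is a sketch for current Python 3 under those assumptions, not something proposed in the thread. write_sorted_keys and contains are hypothetical helpers, and a real data set of this size would need an external sort instead of sorted().

```python
import mmap
import struct
import tempfile

KEY = struct.Struct("<q")   # one little-endian signed 64-bit key per record


def write_sorted_keys(fileobj, keys):
    """Write the distinct keys in ascending order, 8 bytes each."""
    for k in sorted(set(keys)):
        fileobj.write(KEY.pack(k))
    fileobj.flush()


def contains(mm, key):
    """Binary search over the fixed-width records in the mapped file."""
    lo, hi = 0, len(mm) // KEY.size
    while lo < hi:
        mid = (lo + hi) // 2
        (value,) = KEY.unpack_from(mm, mid * KEY.size)
        if value < key:
            lo = mid + 1
        elif value > key:
            hi = mid
        else:
            return True
    return False


with tempfile.TemporaryFile() as f:
    write_sorted_keys(f, [42, 7, 2**40, -3, 7])
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    found = [contains(mm, k) for k in (7, 8, 2**40, -3)]
    mm.close()

print(found)
```

Only the pages touched by a lookup need to be resident, so the operating system decides how much of the file stays in memory.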




Re: find_longest_match in SequenceMatcher

2006-07-25 Thread koara
Hello again John -- your hack/fix seems to work. Thanks a lot, now
let's hope timbot will indeed be here shortly with a proper fix =)



Re: find_longest_match in SequenceMatcher

2006-07-24 Thread koara
John Machin wrote:
 --test results snip---
 Looks to me like the problem has nothing at all to do with the length
 of the searched strings, but a bug appeared in 2.3.  What version(s)
 were you using? Can you reproduce your results (500 vs. 499 giving
 different answers) with the same version?

Hello John, thank you for investigating and responding!

Yes, I can reproduce the behaviour with different results within the
same version -- which is 2.4.3 (#69, Mar 29 2006, 17:35:34) [MSC v.1310
32 bit (Intel)]

The catch is to remove the last character, as I described in my
original post, as opposed to passing reduced length parameters to
find_longest_match, which is what you did.

It is morning now, but I still fail to see the mistake I am making --
if it is indeed a bug, where do I report it?

Cheers!



find_longest_match in SequenceMatcher

2006-07-23 Thread koara
Hello, it might be too late or too hot, but I cannot work out this
behaviour of find_longest_match() in difflib.SequenceMatcher:

string1:
releasenotesforwildmagicversion01thiscdromcontainstheinitialreleaseofthesourcecodethataccompaniesthebook3dgameenginedesign:apracticalapproachtorealtimecomputergraphicsthereareanumberofknownissuesaboutthecodeastheseissuesareaddressedtheupdatedcodewillbeavailableatthewebsitehttp://wwwmagicsoftwarecom/[EMAIL
 PROTECTED]

string2:
releasenotesforwildmagicversion02updatefromversion01toversion02ifyourcopyofthebookhasversion01andifyoudownloadedversion02fromthewebsitethenapplythefollowingdirectionsforinstallingtheupdateforalinuxinstallationseethesectionattheendofthisdocumentupdatedirectionsassumingthatthetopleveldirectoryiscalledmagicreplacebyyourtoplevelnameyoushouldhavetheversion01contentsinthislocation1deletethecontentsofmagic\include2deletethesubdirectorymagic\source\mgcapplication3deletetheobsoletefiles:amagic\source\mgc

find_longest_match(0,500,0,500)=(24,43,10)=version01t

What? O_o Clearly there is a longer match, right at the beginning!
And then, after removal of the last character from each string (I found
the limit of 500 by trial and error -- and it looks suspiciously
round):

find_longest_match(0,499,0,499)=(0,0,32)=releasenotesforwildmagicversion0


Is this the expected behaviour? What's going on?
Thank you for any ideas
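
The thread does not settle what caused the 500 vs. 499 difference in Python 2.4.3. In current Python versions, SequenceMatcher documents a heuristic that treats frequently occurring items as junk once the second sequence has at least 200 items, and this can shorten the reported match. The sketch below uses contrived strings (not the ones from the post) and the autojunk parameter, which exists only in later Python versions, to show that effect.

```python
from difflib import SequenceMatcher

# Contrived inputs: 'a'..'d' each occur 50 times in b, which is enough
# for the heuristic to treat them as "popular" and skip them.
a = "Z" + "abcd" * 50
b = "abcd" * 50 + "Z"

heuristic = SequenceMatcher(None, a, b).find_longest_match(
    0, len(a), 0, len(b))
plain = SequenceMatcher(None, a, b, autojunk=False).find_longest_match(
    0, len(a), 0, len(b))

print("autojunk on :", heuristic)
print("autojunk off:", plain)
```

With the heuristic disabled, the full 200-character run is reported. With it enabled, a much shorter match comes back.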
