[issue13653] reorder set.intersection parameters for better performance

2019-08-25 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

The anti-correlation algorithm seems like a plausible heuristic, but it can't 
really know more than the user does about the semantic content of the sets.  
Also, as Terry pointed out, this will have user-visible effects.

This likely should be published as a blog post or recipe.  A user can already 
control the pairing order and do this, or something like it, themselves.  It is 
more of a technique for using sets than it is a core set algorithm (it reminds 
me of using associativity to optimize chained matrix multiplications, though 
that can be done precisely rather than heuristically).
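
Raymond's point that a user can already control the pairing order can be
illustrated with a small sketch (the helper name and sample sets below are
illustrative, not from the tracker discussion):

```python
def sorted_intersection(*sets):
    """Intersect with the smallest operand first, a common manual ordering."""
    ordered = sorted(sets, key=len)              # smallest set first
    return ordered[0].intersection(*ordered[1:])

a = {"apple", "apricot", "avocado", "banana"}
b = {"banana", "apricot"}
c = {"apricot", "banana", "cherry"}
assert sorted_intersection(a, b, c) == {"apricot", "banana"}
```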

--
resolution:  -> rejected
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34151] use malloc() for better performance of some list operations

2018-08-11 Thread Xiang Zhang


Change by Xiang Zhang :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue34151] use malloc() for better performance of some list operations

2018-08-11 Thread Xiang Zhang


Xiang Zhang  added the comment:


New changeset 2fc46979b8c802675ca7fd51c6f2108a305001c8 by Xiang Zhang (Sergey 
Fedoseev) in branch 'master':
bpo-34151: Improve performance of some list operations (GH-8332)
https://github.com/python/cpython/commit/2fc46979b8c802675ca7fd51c6f2108a305001c8


--
nosy: +xiang.zhang




[issue34151] use malloc() for better performance of some list operations

2018-07-19 Thread Karthikeyan Singaravelan


Change by Karthikeyan Singaravelan :


--
nosy: +xtreak




[issue34151] use malloc() for better performance of some list operations

2018-07-18 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

+1 This looks like a reasonable improvement.

--
nosy: +rhettinger




[issue34151] use malloc() for better performance of some list operations

2018-07-18 Thread Stefan Behnel


Stefan Behnel  added the comment:

Nice! Patch looks good to me, minus the usual naming nit-pick.

--
nosy: +scoder
versions: +Python 3.8




[issue34151] use malloc() for better performance of some list operations

2018-07-18 Thread Sergey Fedoseev


Change by Sergey Fedoseev :


--
keywords: +patch
pull_requests: +7869
stage:  -> patch review




[issue34151] use malloc() for better performance of some list operations

2018-07-18 Thread Sergey Fedoseev


New submission from Sergey Fedoseev :

Currently, list concatenation, slicing, and repetition use PyList_New(), which 
allocates memory for the items with calloc(). malloc() could be used instead, 
since the allocated memory is immediately overwritten by these operations.

I made benchmarks with this script:

NAME=list-malloc-master.json

python -m perf timeit --name slice0 -s "l = [None]*100" "l[:0]" --duplicate=2048 --append $NAME
python -m perf timeit --name slice1 -s "l = [None]*100" "l[:1]" --duplicate=1024 --append $NAME
python -m perf timeit --name slice2 -s "l = [None]*100" "l[:2]" --duplicate=1024 --append $NAME
python -m perf timeit --name slice3 -s "l = [None]*100" "l[:3]" --duplicate=1024 --append $NAME
python -m perf timeit --name slice100 -s "l = [None]*100" "l[:100]" --append $NAME

python -m perf timeit --name cat0 -s "l = [None]*0" "l + l" --duplicate=1024 --append $NAME
python -m perf timeit --name cat1 -s "l = [None]*1" "l * 1" --duplicate=1024 --append $NAME
python -m perf timeit --name cat2 -s "l = [None]*2" "l * 1" --duplicate=1024 --append $NAME
python -m perf timeit --name cat3 -s "l = [None]*3" "l * 1" --duplicate=1024 --append $NAME
python -m perf timeit --name cat100 -s "l = [None]*100" "l * 1" --append $NAME

python -m perf timeit --name 1x0 -s "l = [None]" "l * 0" --duplicate=1024 --append $NAME
python -m perf timeit --name 1x1 -s "l = [None]" "l * 1" --duplicate=1024 --append $NAME
python -m perf timeit --name 1x2 -s "l = [None]" "l * 2" --duplicate=1024 --append $NAME
python -m perf timeit --name 1x3 -s "l = [None]" "l * 3" --duplicate=1024 --append $NAME
python -m perf timeit --name 1x100 -s "l = [None]" "l * 100" --append $NAME


Here's the comparison table:

+-----------+--------------------+------------------------------+
| Benchmark | list-malloc-master | list-malloc                  |
+===========+====================+==============================+
| slice1    | 84.5 ns            | 59.6 ns: 1.42x faster (-30%) |
+-----------+--------------------+------------------------------+
| slice2    | 71.6 ns            | 61.8 ns: 1.16x faster (-14%) |
+-----------+--------------------+------------------------------+
| slice3    | 74.4 ns            | 63.6 ns: 1.17x faster (-15%) |
+-----------+--------------------+------------------------------+
| slice100  | 4.39 ms            | 4.08 ms: 1.08x faster (-7%)  |
+-----------+--------------------+------------------------------+
| cat0      | 23.9 ns            | 24.9 ns: 1.04x slower (+4%)  |
+-----------+--------------------+------------------------------+
| cat1      | 73.2 ns            | 51.9 ns: 1.41x faster (-29%) |
+-----------+--------------------+------------------------------+
| cat2      | 61.6 ns            | 53.1 ns: 1.16x faster (-14%) |
+-----------+--------------------+------------------------------+
| cat3      | 63.0 ns            | 54.3 ns: 1.16x faster (-14%) |
+-----------+--------------------+------------------------------+
| cat100    | 4.38 ms            | 4.08 ms: 1.07x faster (-7%)  |
+-----------+--------------------+------------------------------+
| 1x0       | 27.1 ns            | 27.7 ns: 1.02x slower (+2%)  |
+-----------+--------------------+------------------------------+
| 1x1       | 72.9 ns            | 51.9 ns: 1.41x faster (-29%) |
+-----------+--------------------+------------------------------+
| 1x2       | 60.9 ns            | 52.9 ns: 1.15x faster (-13%) |
+-----------+--------------------+------------------------------+
| 1x3       | 62.5 ns            | 54.8 ns: 1.14x faster (-12%) |
+-----------+--------------------+------------------------------+
| 1x100     | 2.67 ms            | 2.34 ms: 1.14x faster (-12%) |
+-----------+--------------------+------------------------------+

Not significant (1): slice0

--
components: Interpreter Core
messages: 321905
nosy: sir-sigurd
priority: normal
severity: normal
status: open
title: use malloc() for better performance of some list operations
type: performance

___
Python tracker 
<https://bugs.python.org/issue34151>



[issue31108] add __contains__ for list_iterator (and others) for better performance

2017-08-31 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
resolution:  -> rejected
stage: patch review -> resolved
status: open -> closed




[issue31108] add __contains__ for list_iterator (and others) for better performance

2017-08-31 Thread Raymond Hettinger

Raymond Hettinger added the comment:

I recommend rejecting this proposal.

--




[issue31108] add __contains__ for list_iterator (and others) for better performance

2017-08-30 Thread Antoine Pitrou

Antoine Pitrou added the comment:

I don't think I've ever used `in` on an iterator.  I didn't even expect it to 
work, and would not consider its use a good practice.

--
nosy: +pitrou




[issue31108] add __contains__ for list_iterator (and others) for better performance

2017-08-04 Thread Terry J. Reedy

Terry J. Reedy added the comment:

Sorry, I mistakenly assumed, without carefully checking the C code, that the 
speedup was from checking the underlying collection without advancing the 
iterator.  I presume that "++it->it_index;" is the statement to the contrary.  
I should have either asked or found this sooner.  I unlinked the noisy comment.

--




[issue31108] add __contains__ for list_iterator (and others) for better performance

2017-08-04 Thread Terry J. Reedy

Changes by Terry J. Reedy :


--
Removed message: http://bugs.python.org/msg299770




[issue31108] add __contains__ for list_iterator (and others) for better performance

2017-08-04 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Terry, this proposal doesn't change the behavior. It moves the iterator 
forward during the search. The only effect is inlining __next__ in a loop and 
getting rid of the overhead of a few indirections and calls.

--




[issue31108] add __contains__ for list_iterator (and others) for better performance

2017-08-04 Thread Terry J. Reedy

Terry J. Reedy added the comment:

On my particular Win 10 machine with 3.6, the times are 1.52 and 2.14.  But to 
me, the times are irrelevant.  A performance enhancement, by definition, should 
not change computational results, but this does. If an iterator is a standard 
iterator, then 'x in iterator' runs the iterator until x is found or iterator 
is exhausted.  The only current code that properly uses this operation on an 
iterator must be depending on the behavior.

So while the change might fix some buggy code, it would break good code and 
divide 'iterator' into two subgroups: standard iterators and contained 
iterators.  That in turn would require us to document that while generators 
remain standard iterators, builtin collection iterators are now contained 
iterators.  This is confusion we should not inflict on users.

This should be rejected, as we have done with all other proposals to add 
features to iterators that only apply to a subset.  (The apparent exception, 
__length_hint__, was defined to not necessarily be accurate, so that there is 
effectively a default __length_hint__ for all iterators, lambda self: n, where 
n depends on the caller.)
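
The user-visible effect described above is easy to demonstrate: membership 
testing on an iterator consumes elements up to and including the match (a 
minimal sketch):

```python
it = iter([1, 2, 3, 4, 5])
assert 3 in it               # runs the iterator until 3 is found
assert list(it) == [4, 5]    # 1, 2, 3 were consumed by the search

# A second search for an earlier element now fails: it was already consumed.
it = iter([1, 2, 3, 4, 5])
assert 3 in it
assert 2 not in it           # 2 is gone; this scan also exhausts the iterator
assert list(it) == []
```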

--
nosy: +terry.reedy
type: performance -> enhancement




[issue31108] add __contains__ for list_iterator (and others) for better performance

2017-08-03 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

The patch adds almost 40 lines of code and improves the performance of a 
little-known feature by at best 10-20%. Adding an optimization for every new 
iterator type would add a comparable quantity of code. I think this is too 
high a cost.

Using a common template implementation for iterators (issue27438) would 
decrease the relative cost of this feature.

--
components: +Interpreter Core
nosy: +rhettinger, serhiy.storchaka
stage:  -> patch review
versions: +Python 3.7




[issue31108] add __contains__ for list_iterator (and others) for better performance

2017-08-02 Thread Sergey Fedoseev

New submission from Sergey Fedoseev:

> python -mtimeit -s "l = list(range(10))" "l[-1] in l"
1000 loops, best of 3: 1.34 msec per loop
> python -mtimeit -s "l = list(range(10))" "l[-1] in iter(l)"
1000 loops, best of 3: 1.59 msec per loop

--
messages: 299666
nosy: sir-sigurd
priority: normal
severity: normal
status: open
title: add __contains__ for list_iterator (and others) for better performance
type: performance

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue31108>



[issue31108] add __contains__ for list_iterator (and others) for better performance

2017-08-02 Thread Sergey Fedoseev

Changes by Sergey Fedoseev :


--
pull_requests: +3025




[issue13653] reorder set.intersection parameters for better performance

2012-01-02 Thread Jesús Cea Avión

Changes by Jesús Cea Avión j...@jcea.es:


--
nosy: +jcea

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13653



[issue13653] reorder set.intersection parameters for better performance

2011-12-23 Thread Andrew Dalke

Andrew Dalke da...@dalkescientific.com added the comment:

My belief is that the people who use set.intersection with more than two terms 
are 1) going to pass in a list of sets, and 2) don't care about the specific 
order.

To check the validity of my belief, I did a Google Code Search to find cases of 
people using set intersection in Python. I searched for set\.intersection\(\* 
and \.intersection\(.*\, lang:^python$, among others.

I am sad to report that the most common way to compute set.intersection(*list) 
is by using reduce, like:

possible = (set(index[c]) for c in set(otp))
possible = reduce(lambda a, b: a.intersection(b), possible)


That comes from:
  git://github.com/Kami/python-yubico-client.git /yubico/modhex.py
and similar uses are in:
  git://github.com/sburns/PyCap.git /redcap/rc.py
  http://hltdi-l3.googlecode.com/hg//xdg/languages/morpho/fst.py
  http://dsniff.googlecode.com/svn/trunk/dsniff/lib/fcap.py


As well as in the Rosetta Code example for a simple inverted index, at:
  http://rosettacode.org/wiki/Inverted_index#Python

This was also implemented more verbosely in:

http://eats.googlecode.com/svn/trunk/server/eats/views/main.py
intersected_set = sets[0]
for i in range(1, len(sets)):
    intersected_set = intersected_set.intersection(sets[i])

and 

http://iocbio.googlecode.com/svn/trunk/iocbio/microscope/cluster_tools.py
s = set (range (len (data[0])))
for d in zip(*data):
    s = s.intersection(set(find_outliers(d, zoffset=zoffset)))
return sorted(s)

In other words, 7 codebases use a manual pairwise reduction rather than the 
equivalent built-in spelling. (I have not checked which are due to backwards-
compatibility requirements.)
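
For reference, the manual reduce() idiom and the multi-argument built-in are 
equivalent for a non-empty list of sets; a minimal check (the sample sets are 
illustrative):

```python
from functools import reduce

sets = [{1, 2, 3, 4}, {2, 3, 4}, {3, 4, 5}]

# manual pairwise reduction, as found in the codebases above
manual = reduce(lambda a, b: a.intersection(b), sets)

# the built-in multi-argument form
builtin = set.intersection(*sets)

assert manual == builtin == {3, 4}
```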

On the other hand, if someone really wants to have a specific intersection 
order, this shows that it's very easy to write.


I found 4 other code bases where set intersection was used for something other 
than binary intersection, and used the built-in intersection().



git://github.com/valda/wryebash.git/experimental/bait/bait/presenter/impl/filters.py
def get_visible_node_ids(self, filterId):
    if filterId in self.idMask:
        visibleNodeIdSets = [f.get_visible_node_ids(filterId)
                             for f in self._filters]
        return set.intersection(
            *[v for v in visibleNodeIdSets if v is not None])
    return None



http://wallproxy.googlecode.com/svn/trunk/local/proxy.py
if threads[ct].intersection(*threads.itervalues()):
    raise ValueError('All threads failed')
(here, threads' values contain sets)



git://github.com/argriffing/xgcode.git/20100623a.py
header_sets = [set(x) for x in header_list]
header_intersection = set.intersection(*header_sets)




http://pyvenn.googlecode.com/hg//venn.py
to_exclude = set()
for ii in xrange(0, len(self.sets)):
    if (i & (2**ii)):
        sets_to_intersect.append(sets_by_power_of_two[i & (2**ii)])
    else:
        to_exclude = to_exclude.union(sets_by_power_of_two[(2**ii)])
final = set.intersection(*sets_to_intersect) - to_exclude



These all find the intersection of sets (not iterators), and the order of 
evaluation does not appear like it will affect the result.

I do not know though if there will be a performance advantage in these cases to 
reordering. I do know that in my code, and any inverted index, there is an 
advantage.

And I do know that the current CPython implementation has bad worst-case 
performance.

--




[issue13653] reorder set.intersection parameters for better performance

2011-12-23 Thread Terry J. Reedy

Terry J. Reedy tjre...@udel.edu added the comment:

Given that equality is not identity, order does matter, although in 3.2.2 the 
results are the opposite of what one might expect.

a = set((1,2,3))
b = set((1.0, 3.0, 5.0))
print(a & b, b & a)
print(a.intersection(b), b.intersection(a))
a &= b
print(a)
 
{1.0, 3.0} {1, 3}
{1.0, 3.0} {1, 3}
{1.0, 3.0}

In my view, a &= b should remove the members of a that are not in b, rather 
than deleting all and replacing some with equal members of b.

That glitch aside, & remains, and remains binary, for exact control of order. 
The doc should just say that intersection() may re-order the operands for 
efficiency.

--
nosy: +terry.reedy




[issue13653] reorder set.intersection parameters for better performance

2011-12-22 Thread Andrew Dalke

New submission from Andrew Dalke da...@dalkescientific.com:

In Issue3069, Arnaud Delobelle proposed support for multiple values to 
set.intersection() and set.union(), writing: "Intersection is optimized by 
sorting all sets/frozensets/dicts in increasing order of size and only 
iterating over elements in the smallest."

Raymond Hettinger commented therein that he had just added support for multiple 
parameters. However, he did not pick up the proposed change in the attached 
patch, which attempts to improve the intersection performance.

Consider the attached benchmark, which constructs an inverted index mapping a 
letter to the set of words which contain that letter. (Rather, to word index.) 
Here's the output:

## Example output:
# a has 144900 words
# j has 3035 words
# m has 62626 words
# amj takes 5.902/1000 (verify: 289)
# ajm takes 0.292/1000 (verify: 289)
# jma takes 0.132/1000 (verify: 289)


Searching set.intersection(inverted_index[j], inverted_index[m], 
inverted_index[a]) is fully 44 times faster than searching a, m, j!

Of course, the set.intersection() supports any iterable, so would only be an 
optimization for when all of the inputs are set types.

BTW, my own experiments suggest that sorting isn't critical. It's more 
important to find the most anti-correlated set to the smallest set, and the 
following does that dynamically by preferentially choosing sets which are 
likely to not match elements of the smallest set:

def set_intersection(*input_sets):
N = len(input_sets)
min_index = min(range(len(input_sets)), key=lambda x: len(input_sets[x]))
best_mismatch = (min_index+1)%N

new_set = set()
for element in input_sets[min_index]:
# This failed to match last time; perhaps it's a mismatch this time?
if element not in input_sets[best_mismatch]:
continue

# Scan through the other sets
for i in range(best_mismatch+1, best_mismatch+N):
j = i % N
if j == min_index:
continue
# If the element isn't in the set then perhaps this
# set is a better rejection test for the next input element
if element not in input_sets[j]:
best_mismatch = j
break
else:
# The element is in all of the other sets
new_set.add(element)
return new_set


Using this in the benchmark gives

amj takes 0.972/1000 (verify: 289)
ajm takes 0.972/1000 (verify: 289)
jma takes 0.892/1000 (verify: 289)

which clearly shows that this Python algorithm is still 6 times faster (for the 
worst case) than the CPython code.

However, the simple sort solution:


def set_intersection_sorted(*input_sets):
input_sets = sorted(input_sets, key=len)
new_set = set()
for element in input_sets[0]:
if element in input_sets[1]:
if element in input_sets[2]:
new_set.add(element)
return new_set

gives times of 

amj takes 0.492/1000 (verify: 289)
ajm takes 0.492/1000 (verify: 289)
jma takes 0.422/1000 (verify: 289)

no doubt because there's much less Python overhead than my experimental 
algorithm.

--
components: Interpreter Core
files: set_intersection_benchmark.py
messages: 150124
nosy: dalke
priority: normal
severity: normal
status: open
title: reorder set.intersection parameters for better performance
type: enhancement
versions: Python 3.4
Added file: http://bugs.python.org/file24081/set_intersection_benchmark.py




[issue13653] reorder set.intersection parameters for better performance

2011-12-22 Thread Benjamin Peterson

Changes by Benjamin Peterson benja...@python.org:


--
assignee:  -> rhettinger
nosy: +rhettinger




[issue13653] reorder set.intersection parameters for better performance

2011-12-22 Thread Raymond Hettinger

Raymond Hettinger raymond.hettin...@gmail.com added the comment:

Thanks guys.  I'll look at this in detail when I get a chance.  Offhand, it 
seems like a good idea though it may rarely be of benefit.  The only downsides 
I see are that it overrides the user's ability to specify the application order 
and that it would be at odds with proposals to implement ordered sets 
(including a guaranteed order of application in intersection, union, 
difference, etc).

--
priority: normal -> low




Re: Better performance

2008-06-03 Thread Paul Boddie
On 2 Jun, 20:16, David [EMAIL PROTECTED] wrote:

 PHP: Easy to make web pages.
 Perl: Lots of libraries, good text processing support
 Python: Easy to read and maintain

PHP: For the security vulnerabilities.
Perl: For the maintenance problem.
Python: To the rescue!

;-)

 You could even use all 3, in their strong areas. eg, PHP for the web
 page, perl for quick & dirty text-processing utils, and Python for the
 longer, more complicated scripts which you have to maintain.

Yes, it depends on how big the problem really is and how complicated
the solution has to be. Are the files all in the same format? Do they
have to be parsed repeatedly? Will new versions of the files be
released? Will the parser output go straight to a Web page or will it
be stored somewhere?

Paul
--
http://mail.python.org/mailman/listinfo/python-list


Re: Better performance

2008-06-02 Thread Arnaud Delobelle
Franck Y [EMAIL PROTECTED] writes:

 Hello Folks,

 I am facing a problem where i need to parse around 200 files, i have a
 bit of knowledge in PHP/Perl/Python (the magic P :-P)

 Which one would you suggest me since i have to generate a web
 interface ?
 And each one has his  area of 'work'


 Thanks for your help !

Python, of course.

-- 
Arnaud


Re: Better performance

2008-06-02 Thread Bruno Desthuilliers

Franck Y wrote:

Hello Folks,

I am facing a problem where i need to parse around 200 files, i have a
bit of knowledge in PHP/Perl/Python (the magic P :-P)

Which one would you suggest me since i have to generate a web
interface ?
And each one has his  area of 'work'



And where's your performance problem?

You don't give enough details to seriously answer your question, but 
Python is a good tool for file parsing and web development anyway...



Re: Better performance

2008-06-02 Thread David
On Mon, Jun 2, 2008 at 4:42 AM, Franck Y [EMAIL PROTECTED] wrote:
 Hello Folks,

 I am facing a problem where i need to parse around 200 files, i have a
 bit of knowledge in PHP/Perl/Python (the magic P :-P)


Trite answer: Use whatever is going to work best in your circumstances.

All 3 languages can do what you need.

Strong points for the languages as related to your question:

PHP: Easy to make web pages.
Perl: Lots of libraries, good text processing support
Python: Easy to read and maintain

You could even use all 3, in their strong areas. eg, PHP for the web
page, perl for quick & dirty text-processing utils, and Python for the
longer, more complicated scripts which you have to maintain.

David.


Better performance

2008-06-01 Thread Franck Y
Hello Folks,

I am facing a problem where I need to parse around 200 files. I have a
bit of knowledge in PHP/Perl/Python (the magic P :-P).

Which one would you suggest, since I have to generate a web interface?
And each one has its own area of 'work'.

Thanks for your help!


--
http://mail.python.org/mailman/listinfo/python-list

