Re: Google Code Jam language usage

2009-09-15 Thread Bearophile
> (There can be ways to speed up this Python code, I have not tried to
> use a 1D matrix with shifts to find the right starting of the rows as
> in C, and often in such dynamic programming algorithms you can just
> keep 2 rows to avoid storing the whole dynamic matrix, this saves
> memory and speed up code because there are less cache misses.


It was easy to do, running time about 0.13 seconds:


from array import array

def main():
b = "welcome to code jam"
len_b = len(b)
di = array('l', [0]) * len_b
dim1 = array('l', [0]) * len_b
for t in xrange(int(raw_input())):
a = raw_input()
for j in xrange(len_b): # set to 0
di[j] = 0
dim1[j] = 0
if a[0] == b[0]:
   dim1[0] = 1
for i in xrange(1, len(a)):
for j in xrange(len_b): # set to 0
di[j] = 0
di[0] += dim1[0]
if b[0] == a[i]:
di[0] += 1
for j in xrange(1, len_b):
di[j] += dim1[j]
if b[j] == a[i]:
di[j] += dim1[j - 1]
di[j] %= 1
di, dim1 = dim1, di
print "Case #%d: %04d" % (t+1, dim1[len_b - 1])

import psyco; psyco.full()
main()

Bye,
bearophile
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Google Code Jam language usage

2009-09-15 Thread Bearophile
(If you see any error in what I have written here, please tell me.)

So far I've never done a Google Code Jam. There are 12 problems there
(G.C.Jam 2009), and the best solutions are 9 in C++ and 3 in C (I
think they are the best solutions, but I am not sure).
The code of all such best solutions is tidy, well formatted, and
written in a quite compact style, with little extra space, few lines
of code, and short variable names (often 1 or 2 chars long).

The C++ code by TripleM shows he/she is also a topcoder, such code is
written in an aggressive style: they show a standard header where all
data structures from C++ STL are imported, plus some handy and common
templates/functions (with vector, matrix, etc, operations). I think it
may be positive to add similar handy templates/functions/data
structures to Boost or the STL. For example I've lost count of how
many times I have seen reimplemented a vector class/struct with dot
products, etc.

Generally in such contests programs are not that similar to real-world
ones, for example inputs ranges/sizes are perfectly known. But
probably this makes such contests "better".

The plots show that Python programmers seem to produce a higher
percentage of working programs compared to C ones, curiously Pascal
programmers have an higher percentage still. As usual Haskell
programmers look smart.

In other similar contests Python often doesn't allow to produce the
best solutions because it's too much slow at runtime, even using
Psyco. For example in the Code Jam the large problems may require lot
of processing. But experience shows that even for such algorithmic
problems the choice of language isn't the most important thing. The
most important is the choice of a good algorithm, the choice of good
data structures, the use of as little dynamic memory allocations as
possible, and then only now the language becomes important for the
running time.

Problems in such contests are often designed with well defined input
ranges/sizes, and dynamic allocations are one of the slower things in
such programs. So when possible it's better to use static memory, for
example a statically allocated matrix in C.

Python encourages the programmer to use a programming style where
dynamic allocations of memory are very common, but when performance
matters it's better to reduce them to the minimun in the inner loops.

For example this is (I think) the best Python program for the "Welcome
to Code Jam" qualification problem:

# By ZIBADA
import sys
import psyco
psyco.full()
def isset(a, b): return ((a >> b) & 1) != 0
def dbg(a): sys.stderr.write(str(a))
def readint(): return int(raw_input())
def readfloat(): return float(raw_input())
def readarray(foo): return [foo(x) for x in raw_input().split()]

def doit(i, j):
if (i, j) in mem: return mem[(i, j)]
if (j >= len(b)): return 1
if (i >= len(a)): return 0
res = doit(i + 1, j)
if (a[i] == b[j]):
res += doit(i + 1, j + 1)
res %= 1
mem[(i, j)] = res
return res

def run_test(test):
#dbg("Test %d\n" % (test + 1))
global a, b, mem
a = raw_input()
b = "welcome to code jam"
mem = {}

print "Case #%d: %04d" % (test + 1, doit(0, 0))

for test in range(readint()):
run_test(test)


It uses dynamic allocation and recursive calls. It processes the large
input in 0.56 s on my old PC.
The following is instead the starting part of (I think) the best
overall solution, that is in C (the algorithm used is a basic and well
known one, no recursions used. Run time is very fast, I can't time
it):

// By LayCurse
char a[512];
char b[32] = "welcome to code jam";
int d[512][32];
int main() {
...
for (t = 1;t <= tc; t++) {
fgets(a, 505, fp);
memset(d, 0, sizeof(d));
...


It uses a static matrix, and the number of columns is 32 instead of 19
(= len("welcome to code jam")) so the compiler can use just a shift to
find the right row start. One memset clears the whole matrix for each
input string.

This is the same C code by LayCurse translated (with few small
improvements) to Python+Psyco, it runs in about 0.14 seconds here on
the same dataset. Using the right algorithm, right data structures,
and minimizing memory allocations, with Psyco you can often go fast
enough:


from array import array
def main():
b = "welcome to code jam"
len_b = len(b)
d = [array('l', [0]) * len_b for _ in xrange(512)]
for t in xrange(int(raw_input())):
a = raw_input()
for row in d: # set d to 0
for j in xrange(len_b):
row[j] = 0
if a[0] == b[0]:
d[0][0] = 1
for i in xrange(1, len(a)):
d[i][0] += d[i - 1][0]
if b[0] == a[i]:
d[i][0] += 1
for j in xrange(1, len_b):
d[i][j] += d[i - 1][j]
if b[j] == a[i]:
d[i][j] += d[i - 1][j - 1]
d[i][j] %= 1000

Re: Google Code Jam language usage

2009-09-14 Thread Stefan Behnel
Andreas Waldenburger wrote:
> On Mon, 14 Sep 2009 11:17:02 -0400 Terry Reedy wrote:
> 
>> At the recent Google Code Jam, Python was the 3rd most popular
>> language after C/C++ and Java. 
> 
> Good for C/C++ and Java that they are not ranked by "fun per line".

Oh, I actually tend to have a lot of fun per line with Java. But that's
usually not with code I have written myself.

Stefan
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Google Code Jam language usage

2009-09-14 Thread Andreas Waldenburger
On Mon, 14 Sep 2009 11:17:02 -0400 Terry Reedy  wrote:

> At the recent Google Code Jam, Python was the 3rd most popular
> language after C/C++ and Java. 

Good for C/C++ and Java that they are not ranked by "fun per line".

/W


-- 
INVALID? DE!

-- 
http://mail.python.org/mailman/listinfo/python-list