Re: [Tutor] longest common substring

2011-11-13 Thread lina
On Mon, Nov 14, 2011 at 11:56 AM, lina  wrote:
> On Mon, Nov 14, 2011 at 6:28 AM, Andreas Perstinger
>  wrote:
>> On 2011-11-11 14:44, lina wrote:
>>>
>>> You are right, I did not think of these parts before. Actually, my
>>> initial wish was to find possible paths, I mean, possible
>>> substrings, all possible substrings: not the longest one, but at
>>> least bigger than 3.
>>
>> I had some time today and since you have changed your initial task (from
>> finding the longest common path to finding all common paths with a minimum
>> length) I've modified the code and came up with the following solution:
>>
>> def AllCommonPaths(list1, list2, minimum=3):
>>    """ finds all common paths with a minimum length (default = 3)"""
>>
>>    # First we have to initialize the necessary variables:
>>    # M is an empty table where we will store all found matches
>>    # (regardless of their length)
>>
>>    M = [[0] * (len(list2)) for i in range(len(list1))]
>>
>>    # length is a dictionary where we store the length of each common
>>    # path. The keys are the starting positions of the paths in list1.
>>
>>    length = {}
>>
>>    # result will be a list of all found paths
>>
>>    result = []
>>
>>    # Now the hard work begins:
>>    # Each element of list1 is compared to each element in list2
>>    # (x is the index for list1, y is the index for list2).
>>    # If we find a match, we store the distance to the starting point
>>    # of the matching block. If we are in the upper-most row (x == 0)
>>    # or in the left-most column (y == 0) we have to set the starting
>>    # point ourselves because we would get negative indexes if we look
>>    # for the predecessor cell (M[x - 1][y - 1]). Otherwise we are one
>>    # element farther away than the element before, so we add 1 to its
>>    # value.
>>
>>    for x in range(len(list1)):
>>        for y in range(len(list2)):
>>            if list1[x] == list2[y]:
>>                if (x == 0) or (y == 0):
>>                    M[x][y] = 1
>>                else:
>>                    M[x][y] = M[x - 1][y - 1] + 1
>>
>>    # To get everything done in one pass, we update the length of
>>    # the found path in our dictionary as soon as it reaches the minimum
>>    # length. Thus we don't have to go through the whole table a
>>    # second time to get all found paths with the minimum length (we
>>    # don't know yet if we are already at the end of the matching
>>    # block).
>>
>>                if M[x][y] >= minimum:
>>                    length[x + 1 - M[x][y]] = M[x][y]
>>
>>
>>    # We now have for all matching blocks their starting
>>    # position in list1 and their length. Now we cut out these parts
>>    # and create our resulting list

How silly of me, it has nothing to do with x and y. Once I used i and j,
it was crystal clear.

Thanks again for your time,

Best regards,

lina
>
> Storing the starting position as the key is a very smart idea. Before, I
> was stuck on how to use the list itself as a key.
>
>>
>>    for pos in length:
>>        result.append(list1[pos:pos + length[pos]])
>>
>>    return result
>>
>> I've tried to explain what I have done, but I'm sure you will still have
>> questions :-).
>
> I am confused by this matrix/array, about which is the x-axis and which
> is the y-axis.
>
> I must be misunderstanding some parts of the following:
>>
>> Is this close to what you want?
>>
>> Bye, Andreas
>>
>> PS: Here's the function again without comments:
>>
>> def AllCommonPaths(list1, list2, minimum=3):
>>    """ finds all common paths with a minimum length (default = 3)"""
>>
>>    M = [[0] * (len(list2)) for i in range(len(list1))]
>
> Is it correct that list2 is the x-axis and list1 is the y-axis?
>
>>    length = {}
>>    result =[]
>>
>>    for x in range(len(list1)):
>
> Here, for each row,
>
>>        for y in range(len(list2)):
>
> this loop then goes through each column of that row.
>
>>            if list1[x] == list2[y]:
>>                if (x == 0) or (y == 0):
>>                    M[x][y] = 1
>
> Here M[x][y] actually means row x, column y? That seems to conflict with
> the x-axis and y-axis: x picks the row (the y-axis) and y picks the column
> (the x-axis).
>
>>                else:
>>                    M[x][y] = M[x - 1][y - 1] + 1
>>                if M[x][y] >= minimum:
>>                    length[x + 1 - M[x][y]] = M[x][y]
>>
>>    for pos in length:
>>        result.append(list1[pos:pos + length[pos]])
>>
>>    return result
>
> I have no problem understanding the other parts; only the array and
> the axes are entangled in my mind.
>
>> ___
>> Tutor maillist  -  Tutor@python.org
>> To unsubscribe or change subscription options:
>> http://mail.python.org/mailman/listinfo/tutor
>>
>
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] longest common substring

2011-11-13 Thread lina
On Mon, Nov 14, 2011 at 6:28 AM, Andreas Perstinger
 wrote:
> On 2011-11-11 14:44, lina wrote:
>>
>> You are right, I did not think of these parts before. Actually, my
>> initial wish was to find possible paths, I mean, possible
>> substrings, all possible substrings: not the longest one, but at
>> least bigger than 3.
>
> I had some time today and since you have changed your initial task (from
> finding the longest common path to finding all common paths with a minimum
> length) I've modified the code and came up with the following solution:
>
> def AllCommonPaths(list1, list2, minimum=3):
>    """ finds all common paths with a minimum length (default = 3)"""
>
>    # First we have to initialize the necessary variables:
>    # M is an empty table where we will store all found matches
>    # (regardless of their length)
>
>    M = [[0] * (len(list2)) for i in range(len(list1))]
>
>    # length is a dictionary where we store the length of each common
>    # path. The keys are the starting positions of the paths in list1.
>
>    length = {}
>
>    # result will be a list of all found paths
>
>    result = []
>
>    # Now the hard work begins:
>    # Each element of list1 is compared to each element in list2
>    # (x is the index for list1, y is the index for list2).
>    # If we find a match, we store the distance to the starting point
>    # of the matching block. If we are in the upper-most row (x == 0)
>    # or in the left-most column (y == 0) we have to set the starting
>    # point ourselves because we would get negative indexes if we look
>    # for the predecessor cell (M[x - 1][y - 1]). Otherwise we are one
>    # element farther away than the element before, so we add 1 to its
>    # value.
>
>    for x in range(len(list1)):
>        for y in range(len(list2)):
>            if list1[x] == list2[y]:
>                if (x == 0) or (y == 0):
>                    M[x][y] = 1
>                else:
>                    M[x][y] = M[x - 1][y - 1] + 1
>
>    # To get everything done in one pass, we update the length of
>    # the found path in our dictionary as soon as it reaches the minimum
>    # length. Thus we don't have to go through the whole table a
>    # second time to get all found paths with the minimum length (we
>    # don't know yet if we are already at the end of the matching
>    # block).
>
>                if M[x][y] >= minimum:
>                    length[x + 1 - M[x][y]] = M[x][y]
>
>
>    # We now have for all matching blocks their starting
>    # position in list1 and their length. Now we cut out these parts
>    # and create our resulting list

Storing the starting position as the key is a very smart idea. Before, I
was stuck on how to use the list itself as a key.

>
>    for pos in length:
>        result.append(list1[pos:pos + length[pos]])
>
>    return result
>
> I've tried to explain what I have done, but I'm sure you will still have
> questions :-).

I am confused by this matrix/array, about which is the x-axis and which
is the y-axis.

I must be misunderstanding some parts of the following:
>
> Is this close to what you want?
>
> Bye, Andreas
>
> PS: Here's the function again without comments:
>
> def AllCommonPaths(list1, list2, minimum=3):
>    """ finds all common paths with a minimum length (default = 3)"""
>
>    M = [[0] * (len(list2)) for i in range(len(list1))]

Is it correct that list2 is the x-axis and list1 is the y-axis?

>    length = {}
>    result =[]
>
>    for x in range(len(list1)):

Here, for each row,

>        for y in range(len(list2)):

this loop then goes through each column of that row.

>            if list1[x] == list2[y]:
>                if (x == 0) or (y == 0):
>                    M[x][y] = 1

Here M[x][y] actually means row x, column y? That seems to conflict with
the x-axis and y-axis: x picks the row (the y-axis) and y picks the column
(the x-axis).

>                else:
>                    M[x][y] = M[x - 1][y - 1] + 1
>                if M[x][y] >= minimum:
>                    length[x + 1 - M[x][y]] = M[x][y]
>
>    for pos in length:
>        result.append(list1[pos:pos + length[pos]])
>
>    return result

I have no problem understanding the other parts; only the array and
the axes are entangled in my mind.
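
For what it's worth, the row/column question can be checked with a small
sketch. The two sample lists below are made up for illustration; the table
is built the same way as in the function. Each inner list of M is one row,
so the first index (x) walks through list1 from top to bottom and the
second index (y) walks through list2 from left to right.

```python
list1 = ["a", "b", "c"]        # 3 elements -> 3 rows
list2 = ["x", "a", "b", "c"]   # 4 elements -> 4 columns

# Same construction as in the function: one inner list (row) per element
# of list1, each row holding one cell per element of list2.
M = [[0] * len(list2) for _ in range(len(list1))]

for x in range(len(list1)):          # x selects the row (position in list1)
    for y in range(len(list2)):      # y selects the column (position in list2)
        if list1[x] == list2[y]:
            if x == 0 or y == 0:
                M[x][y] = 1
            else:
                M[x][y] = M[x - 1][y - 1] + 1

print(len(M), "rows,", len(M[0]), "columns")
for row in M:
    print(row)
```

So list1 runs down the side of the table and list2 runs along the top,
whatever names are given to the two indexes.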



Re: [Tutor] longest common substring

2011-11-13 Thread Andreas Perstinger

On 2011-11-11 16:53, Jerry Hill wrote:
> There's nothing wrong with writing your own code to find the longest common
> substring, but are you aware that Python has a module in the standard
> library that already does this?  In the difflib module, the SequenceMatcher
> class can compare two sequences and extract the longest common sequence of
> elements from it, like this:


Thanks for the tip. I've played around with it, but I think it doesn't
help in the OP's situation. "SequenceMatcher.find_longest_match()" returns
only a single block, the longest one (here the first of two equally long
blocks):


Python 2.7.1+ (r271:86832, Apr 11 2011, 18:05:24)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import difflib
>>> first = [0, 1, 2, 3, 0, 4, 5, 6, 0]
>>> second = [1, 2, 3, 4, 5, 6]
>>> match = difflib.SequenceMatcher(None, first, second)
>>> match.find_longest_match(0, len(first), 0, len(second))
Match(a=1, b=0, size=3)

Here it returns just [1, 2, 3] but misses [4, 5, 6]. So you would have 
to adjust the lower limits to get it. 
"SequenceMatcher.get_matching_blocks()" seems to be a better choice:


>>> match.get_matching_blocks()
[Match(a=1, b=0, size=3), Match(a=5, b=3, size=3), Match(a=9, b=6, size=0)]

Now you get [1, 2, 3] and [4, 5, 6]. But if the two blocks are in the 
reversed order, there is no longest common subsequence [1, 2, 3, 4, 5, 
6] any more and "SequenceMatcher" only finds one part (apparently it 
chooses the first it comes across in the first list if both have the 
same length):


>>> first = [0, 1, 2, 3, 0, 4, 5, 6, 0]
>>> second = [4, 5, 6, 1, 2, 3]
>>> match = difflib.SequenceMatcher(None, first, second)
>>> match.find_longest_match(0, len(first), 0, len(second))
Match(a=1, b=3, size=3)
>>> match.get_matching_blocks()
[Match(a=1, b=3, size=3), Match(a=9, b=6, size=0)]

From both methods you get [1, 2, 3].

As I've learnt during these tests, there is a difference between
subsequences and substrings:

http://en.wikipedia.org/wiki/Subsequence#Substring_vs._subsequence
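
A minimal sketch of that difference, using the list from the session
above. The two helper names are made up for this illustration and are not
part of difflib: a substring has to be one contiguous block, while a
subsequence only has to keep the order and may skip elements.

```python
def is_substring(needle, haystack):
    """True if needle occurs in haystack as one contiguous block."""
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


def is_subsequence(needle, haystack):
    """True if needle's elements occur in haystack in order, gaps allowed."""
    remaining = iter(haystack)
    # "item in remaining" consumes the iterator up to the first hit, so
    # every following element has to be found after the previous one.
    return all(item in remaining for item in needle)


first = [0, 1, 2, 3, 0, 4, 5, 6, 0]
print(is_substring([1, 2, 3], first))                # contiguous block
print(is_substring([1, 2, 3, 4, 5, 6], first))       # interrupted by a 0
print(is_subsequence([1, 2, 3, 4, 5, 6], first))     # in order, with gaps
```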

If I've understood the OP right, he/she wants to find all common 
substrings with a minimum length regardless of their order in the strings.
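
To make the limitation concrete, here is a small sketch that keeps only
the blocks of a minimum size from "get_matching_blocks()". The helper name
blocks_at_least is an assumption of this example, not something difflib
provides. It reproduces the two sessions above: both blocks are found when
they appear in the same order in both lists, but only one when the order
is reversed.

```python
import difflib


def blocks_at_least(a, b, minimum=3):
    """Return the matching blocks of a and b with at least `minimum` items."""
    matcher = difflib.SequenceMatcher(None, a, b)
    return [a[m.a:m.a + m.size]
            for m in matcher.get_matching_blocks()
            if m.size >= minimum]


first = [0, 1, 2, 3, 0, 4, 5, 6, 0]
same_order = blocks_at_least(first, [1, 2, 3, 4, 5, 6])
reversed_order = blocks_at_least(first, [4, 5, 6, 1, 2, 3])
print(same_order)
print(reversed_order)
```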


Bye, Andreas


Re: [Tutor] longest common substring

2011-11-13 Thread Andreas Perstinger

On 2011-11-11 14:44, lina wrote:
> You are right, I did not think of these parts before. Actually, my
> initial wish was to find possible paths, I mean, possible
> substrings, all possible substrings: not the longest one, but at
> least bigger than 3.


I had some time today and since you have changed your initial task (from 
finding the longest common path to finding all common paths with a 
minimum length) I've modified the code and came up with the following 
solution:


def AllCommonPaths(list1, list2, minimum=3):
    """ finds all common paths with a minimum length (default = 3)"""

    # First we have to initialize the necessary variables:
    # M is an empty table where we will store all found matches
    # (regardless of their length)

    M = [[0] * (len(list2)) for i in range(len(list1))]

    # length is a dictionary where we store the length of each common
    # path. The keys are the starting positions of the paths in list1.

    length = {}

    # result will be a list of all found paths

    result = []

    # Now the hard work begins:
    # Each element of list1 is compared to each element in list2
    # (x is the index for list1, y is the index for list2).
    # If we find a match, we store the distance to the starting point
    # of the matching block. If we are in the upper-most row (x == 0)
    # or in the left-most column (y == 0) we have to set the starting
    # point ourselves because we would get negative indexes if we look
    # for the predecessor cell (M[x - 1][y - 1]). Otherwise we are one
    # element farther away than the element before, so we add 1 to its
    # value.

    for x in range(len(list1)):
        for y in range(len(list2)):
            if list1[x] == list2[y]:
                if (x == 0) or (y == 0):
                    M[x][y] = 1
                else:
                    M[x][y] = M[x - 1][y - 1] + 1

    # To get everything done in one pass, we update the length of
    # the found path in our dictionary as soon as it reaches the minimum
    # length. Thus we don't have to go through the whole table a
    # second time to get all found paths with the minimum length (we
    # don't know yet if we are already at the end of the matching
    # block).

                if M[x][y] >= minimum:
                    length[x + 1 - M[x][y]] = M[x][y]


    # We now have for all matching blocks their starting
    # position in list1 and their length. Now we cut out these parts
    # and create our resulting list

    for pos in length:
        result.append(list1[pos:pos + length[pos]])

    return result

I've tried to explain what I have done, but I'm sure you will still have 
questions :-).


Is this close to what you want?

Bye, Andreas

PS: Here's the function again without comments:

def AllCommonPaths(list1, list2, minimum=3):
    """ finds all common paths with a minimum length (default = 3)"""

    M = [[0] * (len(list2)) for i in range(len(list1))]
    length = {}
    result = []

    for x in range(len(list1)):
        for y in range(len(list2)):
            if list1[x] == list2[y]:
                if (x == 0) or (y == 0):
                    M[x][y] = 1
                else:
                    M[x][y] = M[x - 1][y - 1] + 1
                if M[x][y] >= minimum:
                    length[x + 1 - M[x][y]] = M[x][y]

    for pos in length:
        result.append(list1[pos:pos + length[pos]])

    return result
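
A short usage sketch, with the function repeated so that the block runs on
its own. The sample lists are the ones from the difflib experiments earlier
in the thread; the variable names and the use of sorted() (so the check
does not depend on dictionary ordering) are choices of this example.

```python
def AllCommonPaths(list1, list2, minimum=3):
    """ finds all common paths with a minimum length (default = 3)"""
    M = [[0] * (len(list2)) for i in range(len(list1))]
    length = {}
    result = []

    for x in range(len(list1)):
        for y in range(len(list2)):
            if list1[x] == list2[y]:
                if (x == 0) or (y == 0):
                    M[x][y] = 1
                else:
                    M[x][y] = M[x - 1][y - 1] + 1
                if M[x][y] >= minimum:
                    # key: start position in list1, value: block length
                    length[x + 1 - M[x][y]] = M[x][y]

    for pos in length:
        result.append(list1[pos:pos + length[pos]])

    return result


first = [0, 1, 2, 3, 0, 4, 5, 6, 0]
second = [4, 5, 6, 1, 2, 3]          # the two blocks in reversed order
paths = sorted(AllCommonPaths(first, second))
print(paths)
too_strict = AllCommonPaths(first, second, minimum=4)
print(too_strict)
```

Unlike "get_matching_blocks()", both blocks are reported even though they
appear in a different order in the second list.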


Re: [Tutor] longest common substring

2011-11-13 Thread Dave Angel

On 11/13/2011 08:06 AM, lina wrote:
> Finally, if I am not wrong again, I feel I am kind of starting to
> figure out what's going on, and why it's None.
>
> The main mistake here is that I used result = result.append(something),
> with the "=".
>
> I checked print(id(result)) and print(id(result.append())).
>
> For the NoneType they shared the same id, 8823392, on my laptop. Is it a
> temporary address?

None is a unique object, deliberately.  No matter how many times people 
create None, it'll always be the same object.  So

a = None
b = x.append(y)
a is b            # (True)
id(a) == id(b)    # (True)

Similarly  True and False are unique objects.

Other objects which are equal to each other may or may not have the same 
ID;  you should not count on it.  For example,

x = 45+3
y = 6*8

Checking (x == y) is true, of course.  But checking (x is y) is
indeterminate.  It may be true for the first ten tests you do, and false
the next time.
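
A small sketch of the points above, written for Python 3. The variable
names are only illustrative. It compares with "is" only where identity is
guaranteed (None) and uses "==" for the numbers.

```python
items = []
a = None
b = items.append(1)    # append changes items in place and returns None

print(a is b)          # identity is reliable for None: there is only one
print(b is None)       # the usual way to test for None

x = 45 + 3
y = 6 * 8
print(x == y)          # equality is what you normally want for values
# Whether "x is y" holds depends on the interpreter, so don't rely on it.
```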


--

DaveA



Re: [Tutor] longest common substring

2011-11-13 Thread lina
On Sun, Nov 13, 2011 at 12:40 AM, Andreas Perstinger
 wrote:
> On 2011-11-12 16:24, lina wrote:
>>
>> Thanks, ^_^, now better.
>
> No, I'm afraid you are still not understanding.
>
>> I checked, the sublist (list) here can't be as a key of the results
>> (dict).
>
> "result" isn't a dictionary. It started as an empty list and later becomes a
> null object ("NoneType").
>
> You must not forget that you are inside a for-loop. Simplified your
> situation is like this:
>
> >>> result = []
> >>> for i in range(1,10):
> ...     print("Iteration {0}, result = {1}".format(i, result))
> ...     result = result.append(i)
> ...
> Iteration 1, result = []
> Iteration 2, result = None
> Traceback (most recent call last):
>  File "<stdin>", line 3, in <module>
> AttributeError: 'NoneType' object has no attribute 'append'
>
> As you see the error happens in the *second* iteration, because result is no
> longer a list.
> Dave already gave you the explanation: functions and methods always return a
> value in Python. If they don't have a return statement they return "None".
>
> Another simple example:
>
> >>> a = print("Test")
> Test
>
> "print" is a function which prints out the text you passed to it and you
> usually aren't interested in its return value. But every function/method in
> Python returns something. You save this value in "a"
>
> >>> print(a)
> None
>
> As you see the return value of "print" is "None".
>
> >>> a.append(x)
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> AttributeError: 'NoneType' object has no attribute 'append'
>
> Same error as above, because "NoneType" objects (null objects) don't have a
> method "append".
>
> I also think you mix two different ways to add an element to a list:
>
> result.append(x)
>
> gives the same list contents as
>
> result = result + [x] (which builds a new list; that's what you will use in
> some other languages)

Finally, if I am not wrong again, I feel I am kind of starting to
figure out what's going on, and why it's None.

The main mistake here is that I used result = result.append(something),
with the "=".

I checked print(id(result)) and print(id(result.append())).

For the NoneType they shared the same id, 8823392, on my laptop. Is it a
temporary address?

>
> HTH, Andreas

It really helps, thanks again.

Haha ... I usually have to read the emails from the list several times,
again and again, to understand them.

Best regards,



Re: [Tutor] longest common substring

2011-11-12 Thread Joel Goldstick
On Sat, Nov 12, 2011 at 11:40 AM, Andreas Perstinger <
andreas.perstin...@gmx.net> wrote:

> On 2011-11-12 16:24, lina wrote:
>
>> Thanks, ^_^, now better.
>>
>
> No, I'm afraid you are still not understanding.
>
>
>  I checked, the sublist (list) here can't be as a key of the results
>> (dict).
>>
>
> "result" isn't a dictionary. It started as an empty list and later becomes
> a null object ("NoneType").
>
> You must not forget that you are inside a for-loop. Simplified your
> situation is like this:
>
> >>> result = []
> >>> for i in range(1,10):
> ...     print("Iteration {0}, result = {1}".format(i, result))
> ...     result = result.append(i)
> ...
> Iteration 1, result = []
> Iteration 2, result = None
>
> Traceback (most recent call last):
>  File "<stdin>", line 3, in <module>
>
> AttributeError: 'NoneType' object has no attribute 'append'
>
> As you see the error happens in the *second* iteration, because result is
> no longer a list.
> Dave already gave you the explanation: functions and methods always return
> a value in Python. If they don't have a return statement they return "None".
>
> Another simple example:
>
> >>> a = print("Test")
> Test
>
> "print" is a function which prints out the text you passed to it and you
> usually aren't interested in its return value. But every function/method in
> Python returns something. You save this value in "a"
>
> >>> print(a)
> None
>
> As you see the return value of "print" is "None".
>
> >>> a.append(x)
>
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>
> AttributeError: 'NoneType' object has no attribute 'append'
>
> Same error as above, because "NoneType" objects (null objects) don't have
> a method "append".
>
> I also think you mix two different ways to add an element to a list:
>
> result.append(x)
>
> gives the same list contents as
>
> result = result + [x] (which builds a new list; that's what you will use in
> some other languages)
>
> HTH, Andreas
>

This is a fascinating thread in the same way that people can't help slowing
down and looking at a car crash on the side of the road is fascinating.

The original poster it seems is new to programming and has offered that he
likes to run before he learns to walk.  That doesn't seem like a good
proclamation to make when you are asking people to help you.  Anyway, this
particular piece of code is pretty tricky stuff.  It involves understanding
list comprehensions, multidimensional lists, and slices.  All things that
take more than a passing interest in to grasp.

Furthermore, the algorithm itself is pretty tricky.  If you follow the link
to the wikipedia article:
http://en.wikipedia.org/wiki/Longest_common_substring you learn that
understanding the algorithm requires understanding of trees (Generalized
suffix trees at that!).
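
For reference, the table-based code discussed in this thread does not
build a suffix tree; it records, for each pair of positions, how long the
match ending there is. A minimal sketch of that simpler idea is below. The
function name is only illustrative, and keeping just the previous row of
the table is a choice of this example rather than something taken from the
posted code.

```python
def longest_common_substring(a, b):
    """Return the first longest contiguous block shared by a and b."""
    best_len = 0            # length of the best block seen so far
    best_end = 0            # index in a just past the end of that block
    prev = [0] * (len(b) + 1)
    for x in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for y in range(1, len(b) + 1):
            if a[x - 1] == b[y - 1]:
                # extend the match that ended one step up and to the left
                curr[y] = prev[y - 1] + 1
                if curr[y] > best_len:
                    best_len, best_end = curr[y], x
        prev = curr
    return a[best_end - best_len:best_end]


print(longest_common_substring("xabcdy", "zzabcdq"))
print(longest_common_substring([0, 1, 2, 3, 0, 4, 5, 6, 0],
                               [4, 5, 6, 1, 2, 3]))
```

It works on strings and on lists alike, because it only uses indexing,
slicing and equality tests.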

I copied the code from the original article and played around with it for
an hour or so to understand it.  I didn't get the answers that I expected
either.

If you are learning coding in general and python in particular this
exercise seems unproductive.  Work through basic concepts.  If you don't
like one set of tutorials, find a different one.  If you don't like to
read, check out google videos for learning python, or youtube for that
matter.  But taking on a concise algorithm that solves a problem that would
challenge a 3rd year CS student doesn't seem like a good idea.



-- 
Joel Goldstick


Re: [Tutor] longest common substring

2011-11-12 Thread Andreas Perstinger

On 2011-11-12 16:24, lina wrote:
> Thanks, ^_^, now better.

No, I'm afraid you are still not understanding.

> I checked, the sublist (list) here can't be as a key of the results (dict).


"result" isn't a dictionary. It started as an empty list and later 
becomes a null object ("NoneType").


You must not forget that you are inside a for-loop. Simplified, your
situation looks like this:


>>> result = []
>>> for i in range(1,10):
...     print("Iteration {0}, result = {1}".format(i, result))
...     result = result.append(i)
...
Iteration 1, result = []
Iteration 2, result = None
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
AttributeError: 'NoneType' object has no attribute 'append'

As you see the error happens in the *second* iteration, because result
is no longer a list.
Dave already gave you the explanation: functions and methods always
return a value in Python. If they don't have a return statement they
return "None".
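
The same rule can be seen with a function of your own. This is only a
sketch for Python 3; the two function names are made up for the example.

```python
def add_item(items, value):
    items.append(value)        # no return statement: the call yields None


def add_item_and_return(items, value):
    items.append(value)
    return items               # explicit return: the call yields the list


numbers = []
first_call = add_item(numbers, 1)
second_call = add_item_and_return(numbers, 2)
print(first_call)      # None
print(second_call)     # the list itself
```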


Another simple example:

>>> a = print("Test")
Test

"print" is a function which prints out the text you passed to it and you 
usually aren't interested in its return value. But every function/method 
in Python returns something. You save this value in "a"


>>> print(a)
None

As you see the return value of "print" is "None".

>>> a.append(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'append'

Same error as above, because "NoneType" objects (null objects) don't 
have a method "append".


I also think you mix two different ways to add an element to a list:

result.append(x)

gives the same list contents as

result = result + [x] (which builds a new list; that's what you will use in
some other languages)
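
The two forms end with the same contents but are not identical in effect,
as this small sketch tries to show (the name alias is just for
illustration): append changes the existing list, while + creates a new
list and the assignment rebinds the name to it.

```python
result = [1, 2]
alias = result              # a second name for the same list object

result.append(3)            # in place: the change is visible through alias
print(alias)

result = result + [4]       # new list: alias still refers to the old one
print(result)
print(alias)
print(result is alias)
```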

HTH, Andreas


Re: [Tutor] longest common substring

2011-11-12 Thread lina
On Sat, Nov 12, 2011 at 10:57 PM, Dave Angel  wrote:
> On 11/12/2011 09:48 AM, lina wrote:
>>
>> On Sat, Nov 12, 2011 at 9:22 PM, Dave Angel  wrote:
>>>
>>> On 11/12/2011 03:54 AM, lina wrote:

 
>>>> The one I tried :
>>>>                 if longest>= 2:
>>>>                     sublist=L1[x_longest-longest:x_longest]
>>>>                     result=result.append(sublist)
>>>>                     if sublist not in sublists:
>>>>                          sublists.append(sublist)
>>>>
>>>> the $ python3 CommonSublists.py
>>>> atom-pair_1.txt atom-pair_2.txt
>>>> Traceback (most recent call last):
>>>>   File "CommonSublists.py", line 47, in <module>
>>>>     print(CommonSublist(a,b))
>>>>   File "CommonSublists.py", line 24, in CommonSublist
>>>>     result=result.append(sublist)
>>>> AttributeError: 'NoneType' object has no attribute 'append'
>>>>
>>>> In the local scope I set result=[].
>>>> I don't know why it complains that it's NoneType, since "result" is
>>>> nearly the same as "sublists".

>>> Assuming this snippet is part of a loop, I see the problem:
>>>
>>> result  = result.append(sublist)
>>>
>>> list.append() returns None.  It modifies the list object in place, but it
>>> doesn't return anything useful.  So that statement modifies the result
>>> object, appending the sublist to it, then it rebinds result to None.  The
>>> second time around you see that error.
>>
>> I am sorry, haha ... I still don't understand the sentence above.
>>
>> >>> a
>> ['3', '5', '7', '8', '9']
>> >>> d.append(a)
>> >>> d
>> [['3', '5', '7', '8', '9']]
>> >>> type(a)
>> <class 'list'>
>>
>> Sorry and thanks, best regards,
>>
>> lina
>>
>>>
>>> In general, most methods in the standard library either modify the object
>>> they're working on, OR they return something.   The append method is in
>>> the
>>> first category.
>>>
>>>
>
> To keep it simple, I'm using three separate variables.  d and a are as you
> tried to show above.  Now what happens when I append?
>
> Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
> [GCC 4.5.2] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> d = []
> >>> a = [3, 5, 7]
> >>> xxx = d.append(a)
> >>> print(repr(xxx))
> None
> >>> print d
> [[3, 5, 7]]
>
> Notice that d does change as we expected.  But xxx, the return value, is
> None. The append() method doesn't return any useful value, so don't assign
> it to anything.

Thanks, ^_^, now better.

I checked: a sublist (list) here can't be used as a key of the results (dict).

Actually I also wish to count the occurrences of those sublists in the
script itself, instead of using an external tool such as uniq -c on the
command line.
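
One possible way to do that counting, sketched under the assumption that
the sublists are plain lists of strings (the sample data is invented): a
list cannot be a dictionary key because it is mutable, but the same items
as a tuple can, and collections.Counter then does the counting.

```python
from collections import Counter

sublists = [['3', '5', '7'], ['8', '9'], ['3', '5', '7']]

try:
    {['3', '5', '7']: 1}                 # a list as a key is rejected
except TypeError as exc:
    print("list as key fails:", exc)

# Convert each sublist to a tuple so it can be used as a key.
counts = Counter(tuple(sublist) for sublist in sublists)
for sublist, number in counts.items():
    print(number, list(sublist))
```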

^_^ Have a nice weekend,

>
> The statement in your code that's wrong is
>    result = result.append(sublist)
>
> The final value that goes into result is None, no matter what the earlier
> values of result and sublist were.
>
> --
>
> DaveA
>


Re: [Tutor] longest common substring

2011-11-12 Thread Dave Angel

On 11/12/2011 09:48 AM, lina wrote:
> On Sat, Nov 12, 2011 at 9:22 PM, Dave Angel  wrote:
>> On 11/12/2011 03:54 AM, lina wrote:
>>>
>>> The one I tried :
>>>                 if longest>= 2:
>>>                     sublist=L1[x_longest-longest:x_longest]
>>>                     result=result.append(sublist)
>>>                     if sublist not in sublists:
>>>                          sublists.append(sublist)
>>>
>>> the $ python3 CommonSublists.py
>>> atom-pair_1.txt atom-pair_2.txt
>>> Traceback (most recent call last):
>>>   File "CommonSublists.py", line 47, in <module>
>>>     print(CommonSublist(a,b))
>>>   File "CommonSublists.py", line 24, in CommonSublist
>>>     result=result.append(sublist)
>>> AttributeError: 'NoneType' object has no attribute 'append'
>>>
>>> In the local scope I set result=[].
>>> I don't know why it complains that it's NoneType, since "result" is
>>> nearly the same as "sublists".
>>
>> Assuming this snippet is part of a loop, I see the problem:
>>
>> result  = result.append(sublist)
>>
>> list.append() returns None.  It modifies the list object in place, but it
>> doesn't return anything useful.  So that statement modifies the result
>> object, appending the sublist to it, then it rebinds result to None.  The
>> second time around you see that error.
>
> I am sorry, haha ... I still don't understand the sentence above.
>
> >>> a
> ['3', '5', '7', '8', '9']
> >>> d.append(a)
> >>> d
> [['3', '5', '7', '8', '9']]
> >>> type(a)
> <class 'list'>
>
> Sorry and thanks, best regards,
>
> lina
>
>> In general, most methods in the standard library either modify the object
>> they're working on, OR they return something.   The append method is in the
>> first category.




To keep it simple, I'm using three separate variables.  d and a are as 
you tried to show above.  Now what happens when I append?


Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> d = []
>>> a = [3, 5, 7]
>>> xxx = d.append(a)
>>> print(repr(xxx))
None
>>> print d
[[3, 5, 7]]

Notice that d does change as we expected.  But xxx, the return value, is 
None. The append() method doesn't return any useful value, so don't 
assign it to anything.


The statement in your code that's wrong is
result = result.append(sublist)

The final value that goes into result is None, no matter what the 
earlier values of result and sublist were.
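
A sketch of the corrected pattern, with the surrounding loop reduced to a
made-up helper (collect_sublists and its sample input are assumptions of
this example, not the original script): append is called on its own line
and its return value is never assigned.

```python
def collect_sublists(found):
    """Hypothetical helper: keep every sublist plus a de-duplicated list."""
    result = []
    sublists = []
    for sublist in found:
        result.append(sublist)           # call append for its side effect only
        if sublist not in sublists:
            sublists.append(sublist)
    return result, sublists


result, sublists = collect_sublists([[3, 5], [7, 8], [3, 5]])
print(result)
print(sublists)
```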


--

DaveA


Re: [Tutor] longest common substring

2011-11-12 Thread lina
On Sat, Nov 12, 2011 at 9:22 PM, Dave Angel  wrote:
> On 11/12/2011 03:54 AM, lina wrote:
>>
>> 
>> The one I tried :
>>                 if longest>= 2:
>>                     sublist=L1[x_longest-longest:x_longest]
>>                     result=result.append(sublist)
>>                     if sublist not in sublists:
>>                          sublists.append(sublist)
>>
>> the $ python3 CommonSublists.py
>> atom-pair_1.txt atom-pair_2.txt
>> Traceback (most recent call last):
>>   File "CommonSublists.py", line 47, in <module>
>>     print(CommonSublist(a,b))
>>   File "CommonSublists.py", line 24, in CommonSublist
>>     result=result.append(sublist)
>> AttributeError: 'NoneType' object has no attribute 'append'
>>
>> In the local scope I set result=[].
>> I don't know why it complains that it's NoneType, since "result" is
>> nearly the same as "sublists".
>>
> Assuming this snippet is part of a loop, I see the problem:
>
> result  = result.append(sublist)
>
> list.append() returns None.  It modifies the list object in place, but it
> doesn't return anything useful.  So that statement modifies the result
> object, appending the sublist to it, then it rebinds result to None.  The
> second time around you see that error.

I am sorry, haha ... I still don't understand the sentence above.

>>> a
['3', '5', '7', '8', '9']
>>> d.append(a)
>>> d
[['3', '5', '7', '8', '9']]
>>> type(a)
<class 'list'>

Sorry and thanks, best regards,

lina

>
> In general, most methods in the standard library either modify the object
> they're working on, OR they return something.   The append method is in the
> first category.
>
>
> --
>
> DaveA
>
>


Re: [Tutor] longest common substring

2011-11-12 Thread Dave Angel

On 11/12/2011 03:54 AM, lina wrote:
>
> The one I tried :
>                 if longest>= 2:
>                     sublist=L1[x_longest-longest:x_longest]
>                     result=result.append(sublist)
>                     if sublist not in sublists:
>                          sublists.append(sublist)
>
> the $ python3 CommonSublists.py
> atom-pair_1.txt atom-pair_2.txt
> Traceback (most recent call last):
>   File "CommonSublists.py", line 47, in <module>
>     print(CommonSublist(a,b))
>   File "CommonSublists.py", line 24, in CommonSublist
>     result=result.append(sublist)
> AttributeError: 'NoneType' object has no attribute 'append'
>
> In the local scope I set result=[].
> I don't know why it complains that it's NoneType, since "result" is
> nearly the same as "sublists".


Assuming this snippet is part of a loop, I see the problem:

result  = result.append(sublist)

list.append() returns None.  It modifies the list object in place, but 
it doesn't return anything.  So that statement modifies the result 
object, appending the sublist to it, then it sets it to None.  The 
second time around you see that error.


In general, most methods in the standard library either modify the 
object they're working on, OR they return something.   The append method 
is in the first category.
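
A minimal sketch of the behaviour described above. The list contents are made up for illustration; only the use of append() comes from the posted code:

```python
result = []
sublist = ['3', '5', '7']

# Correct use: append() changes the list in place, so ignore its return value.
returned = result.append(sublist)
print(returned)   # None
print(result)     # [['3', '5', '7']]

# The pattern from the traceback: the name is rebound to append()'s return value.
result = result.append(['8', '9'])
print(result)     # None, so a further result.append(...) raises AttributeError
```

Dropping the "result=" in front of "result.append(sublist)" should be enough to make the error go away.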



--

DaveA



Re: [Tutor] longest common substring

2011-11-12 Thread lina


Sorry I finished last email in two different time,

while:
> INFILEEXT=".doc"
>
>
> def CommonSublist(L1, L2):
>    sublist=[]
>    sublists=[]
>    result=[]
>    M = [[0]*(1+len(L2)) for i in range(1+len(L1))]
>    longest, x_longest = 0, 0
>    for x in range(1,1+len(L1)):
>        for y in range(1,1+len(L2)):
>            if L1[x-1] == L2[y-1]:
>                M[x][y] = M[x-1][y-1]+1
>                if M[x][y] > longest:
>                    longest = M[x][y]
>                    x_longest = x
>                if longest >= 2:
>                    sublist=L1[x_longest-longest:x_longest]
>                    if sublist not in sublists:
>                         sublists.append(sublist)
>
>
>            else:
>                    M[x][y] = 0
>
>    return sublists
>
>
>
> if __name__=="__main__":
>
>
>    for i in range(1,11):
>        for j in range(1,11):
>            if i != j:
>                fileone="atom-pair_"+str(i)+".txt"
>                filetwo="atom-pair_"+str(j)+".txt"

correction: here not ".txt", it's ".doc"
>                a=open(fileone,"r").readline().strip().split(' ')
>                b=open(filetwo,"r").readline().strip().split(' ')
>                print(fileone,filetwo)
>                print(CommonSublist(a,b))
>
> The output results:
>
the output is:

atom-pair_10.doc atom-pair_8.doc
[['75', '64'], ['13', '64', '75'], ['64', '62', '75', '16']]
atom-pair_10.doc atom-pair_9.doc
[['65', '46'], ['13', '75', '64']]

seems a bit better than before.

> atom-pair_10.txt atom-pair_8.txt
> [["'75',", "'64',"], ["'13',", "'64',", "'75',"], ["'64',", "'62',",
> "'75',", "'16',"]]
> atom-pair_10.txt atom-pair_9.txt
> [["'65',", "'46',"], ["'13',", "'75',", "'64',"]]
>

> the $ python3 CommonSublists.py
> atom-pair_1.txt atom-pair_2.txt
> Traceback (most recent call last):
>  File "CommonSublists.py", line 47, in <module>
>    print(CommonSublist(a,b))
>  File "CommonSublists.py", line 24, in CommonSublist
>    result=result.append(sublist)
> AttributeError: 'NoneType' object has no attribute 'append'
>
> in local domain I set the result=[]
> I don't know why it complains its NoneType, since the "result" is
> nearly the same as "sublists".
>

Thanks with best regards,


Re: [Tutor] longest common substring

2011-11-12 Thread lina
On Sat, Nov 12, 2011 at 5:49 AM, Andreas Perstinger
 wrote:
> First, just a little rant :-)
> It doesn't help to randomly change some lines or introduce some new concepts
> you don't understand yet and then hope to get the right result. Your chances
> are very small that this will be successful.
> You should try to understand some basic concepts first and build on them.
> From your postings the last weeks and especially from today I have the
> impression that you still don't understand how fundamental programming
> concepts work: for-loops, differences between data types (strings, lists,
> sets, ...)
> Honestly, have you already read any programming tutorial? (You'll find a big
> list at http://wiki.python.org/moin/BeginnersGuide/NonProgrammers )? At the
> moment it looks like you are just copying some code snippets from different
> places and then you hopelessly try to modify them to suit your needs. IMHO
> the problems you want to solve are a little too big for you right now.
>
> Nevertheless, here are some comments:

Thanks, Those are very valuable comments. Since I read your post till
the following hours, my mind was haunted by what you pointed out.
The reflection went far away. I had/have a VERY BAD HABIT in learning
and doing things.
My father used to say I was the person did not know how to walk, but
started to run.
Later I realized in my life, such as I barely read a manual/usage/map,
but started messing things up.
(I did destroy something I newly bought without spending 2 mins
reading the usage, I could not forget because it's expensive, haha
...).
In the past, for difficulty questions I could do pretty not bad, but
for basic concepts or step by step detailed things I failed more than
once.

But also very honestly answering, that I did try to read some books,
one is dive into python, another is learning python the hard way. and
now I have programming python by Mark Lutz, and another python book on
bedside.
The mainly problems was that I felt nothing when I just read for
reading. forget so easily what I read.
( Now I am a little worried, the bad habit I have had will affect me
go far away or build something serious. Sigh ... )
In the past hours, I tried to read the basic concepts, but get lost
(not lost, just mind becomes empty and inactive) in minutes.

Thanks again for your pointing out. I will remind myself in future.
>
>> Based on former advice, I made a correction/modification on the below
>> code.
>>
>> 1] the set and subgroup does not work, here I wish to put all the
>> subgroup in a big set, the set like
>
> That's a good idea, but you don't use the set correctly.
>
>> subgroups=[]
>> subgroup=[]
>> def LongestCommonSubstring(S1, S2):
>
> I think it's better to move "subgroups" and "subgroup" into the function.
> (I've noticed that in most of your scripts you are using a lot of global
> variables. IMHO that's not the best programming style. Do you know what
> "global/local variables", "namespace", "scope" mean?)
>
> You are defining "subgroups" as an empty list, but later you want to use it
> as a set. Thus, you should define it as an empty set:
>
> subgroups = set()
>
> You are also defining "subgroup" as an empty list, but later you assign a
> slice of "S1" to it. Since "S1" is a string, the slice is also a string.
> Therefore:
>
> subgroup = ""
>
>>      M = [[0]*(1+len(S2)) for i in xrange(1+len(S1))]
>
> Peter told you already why "xrange" doesn't work in Python 3. But instead of
> using an alias like
>
> xrange = range
>
> IMHO it's better to change it in the code directly.
>
>>      longest, x_longest = 0, 0
>>      for x in xrange(1,1+len(S1)):
>>          for y in xrange(1,1+len(S2)):
>>              if S1[x-1] == S2[y-1]:
>>                  M[x][y] = M[x-1][y-1]+1
>>                  if M[x][y]>  longest:
>>                      longest = M[x][y]
>>                      x_longest = x
>>                  if longest>= 3:
>>                      subgroup=S1[x_longest-longest:x_longest]
>>                      subgroups=set([subgroup])
>
> Here you overwrite in the first iteration your original empty list
> "subgroups" with the set of the list which contains the string "subgroup" as
> its only element. Do you really understand this line?
> And in all the following iterations you are overwriting this one-element set
> with another one-element set (the next "subgroup").
> If you want to add an element to an existing set instead of replacing it,
> you have to use the "add()"-method for adding an element to a set:
>
> subgroups.add(subgroup)
>
> This will add the string "subgroup" as a new element to the set "subgroups".
>
>>                      print(subgroups)
>>              else:
>>                      M[x][y] = 0
>>
>>      return S1[x_longest-longest:x_longest]
>
> Here you probably want to return the set "subgroups":
>
> return subgroups
I will return to this parts later.

Based on your advice, I updated the code to the one below (which partially works):

#!/usr/bin/python3

impo

Re: [Tutor] longest common substring

2011-11-11 Thread Andreas Perstinger

First, just a little rant :-)
It doesn't help to randomly change some lines or introduce some new 
concepts you don't understand yet and then hope to get the right result. 
Your chances are very small that this will be successful.

You should try to understand some basic concepts first and build on them.
From your postings the last weeks and especially from today I have the 
impression that you still don't understand how fundamental programming 
concepts work: for-loops, differences between data types (strings, 
lists, sets, ...)
Honestly, have you already read any programming tutorial? (You'll find a 
big list at http://wiki.python.org/moin/BeginnersGuide/NonProgrammers )? 
At the moment it looks like you are just copying some code snippets from 
different places and then you hopelessly try to modify them to suit your 
needs. IMHO the problems you want to solve are a little too big for you 
right now.


Nevertheless, here are some comments:


Based on former advice, I made a correction/modification on the below code.

1] the set and subgroup does not work, here I wish to put all the
subgroup in a big set, the set like


That's a good idea, but you don't use the set correctly.

> subgroups=[]
> subgroup=[]
> def LongestCommonSubstring(S1, S2):

I think it's better to move "subgroups" and "subgroup" into the 
function. (I've noticed that in most of your scripts you are using a lot 
of global variables. IMHO that's not the best programming style. Do you 
know what "global/local variables", "namespace", "scope" mean?)


You are defining "subgroups" as an empty list, but later you want to use 
it as a set. Thus, you should define it as an empty set:


subgroups = set()

You are also defining "subgroup" as an empty list, but later you assign 
a slice of "S1" to it. Since "S1" is a string, the slice is also a 
string. Therefore:


subgroup = ""

>      M = [[0]*(1+len(S2)) for i in xrange(1+len(S1))]

Peter told you already why "xrange" doesn't work in Python 3. But 
instead of using an alias like


xrange = range

IMHO it's better to change it in the code directly.

>      longest, x_longest = 0, 0
>      for x in xrange(1,1+len(S1)):
>          for y in xrange(1,1+len(S2)):
>              if S1[x-1] == S2[y-1]:
>                  M[x][y] = M[x-1][y-1]+1
>                  if M[x][y]>  longest:
>                      longest = M[x][y]
>                      x_longest = x
>                  if longest>= 3:
>                      subgroup=S1[x_longest-longest:x_longest]
>                      subgroups=set([subgroup])

Here you overwrite in the first iteration your original empty list 
"subgroups" with the set of the list which contains the string 
"subgroup" as its only element. Do you really understand this line?
And in all the following iterations you are overwriting this one-element 
set with another one-element set (the next "subgroup").
If you want to add an element to an existing set instead of replacing 
it, you have to use the "add()"-method for adding an element to a set:


subgroups.add(subgroup)

This will add the string "subgroup" as a new element to the set "subgroups".
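
A small sketch of the difference between adding to a set and overwriting it. The strings are invented sample values, not taken from your data files:

```python
subgroups = set()

for subgroup in ["abc", "bcd", "abc"]:
    subgroups.add(subgroup)        # grows the existing set; duplicates are ignored

print(len(subgroups))              # 2

# By contrast, this replaces the whole set with a one-element set every time:
overwritten = set()
for subgroup in ["abc", "bcd", "abc"]:
    overwritten = set([subgroup])

print(overwritten)                 # {'abc'}
```

Note that only hashable objects can be set elements. If "subgroup" were a list slice instead of a string, it would have to be converted first, for example with tuple(subgroup).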

>                      print(subgroups)
>              else:
>                      M[x][y] = 0
>
>      return S1[x_longest-longest:x_longest]

Here you probably want to return the set "subgroups":

return subgroups



2] I still have trouble in reading files, mainly about not read "" etc.


The problem is that each of your data files contains just one big one-line 
string. AFAIK you have produced these data files yourself, haven't you? 
In that case it would be better to change the way you save the data 
(be it a well-formatted string or a list or something else) instead of 
trying to fix it here (in this script).
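
One possible way to do that, sketched under assumptions: the json module from the standard library writes a list in a form that can be read back as a list. The sample values and the file name "atom-pair_demo.json" are hypothetical, and the temporary directory is only there to keep the sketch self-contained:

```python
import json
import os
import tempfile

pairs = ['71', '82', '80', '70']          # made-up sample data

# Hypothetical file name inside a throwaway directory
path = os.path.join(tempfile.mkdtemp(), "atom-pair_demo.json")

# Save the list as JSON instead of writing str(pairs)
with open(path, "w") as f:
    json.dump(pairs, f)

# Loading gives back a real list, with no quotes or commas to strip by hand
with open(path) as f:
    loaded = json.load(f)

print(loaded == pairs)                    # True
```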


Bye, Andreas


Re: [Tutor] longest common substring

2011-11-11 Thread lina


I wrote a crazy one, to find the common group:

Please jump to the end part of this code:

https://docs.google.com/open?id=0B93SVRfpVVg3MDUzYzI1MDYtNmI5MS00MmZkLTlmMTctNmE3Y2EyYzIyZTk2

Thanks again,


Re: [Tutor] longest common substring

2011-11-11 Thread Jerry Hill
There's nothing wrong with writing your own code to find the longest common
substring, but are you aware that python has a module in the standard
library that already does this?  In the difflib module, the SequenceMatcher
class can compare two sequences and extract the longest common sequence of
elements from it, like this:

Code:
import difflib

a = [1, 2, 3, 7]
b = [2, 3, 7]

seq_matcher = difflib.SequenceMatcher(None, a, b)
print(seq_matcher.find_longest_match(0, len(a), 0, len(b)))

Outputs:
Match(a=1, b=0, size=3)

See http://docs.python.org/library/difflib.html#sequencematcher-objects for
lots of details.  The SequenceMatcher class can do a lot more than finding
the common substrings, as you might expect.  The Module of the Week article
for difflib may also be of interest:
http://www.doughellmann.com/PyMOTW/difflib/index.html
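
Since the original poster runs python3 with lists of strings, here is a hedged variant of the same idea. The variable names are my own; autojunk=False is an assumption about what is wanted, it switches off the popularity heuristic that SequenceMatcher otherwise applies to sequences of 200 or more items:

```python
import difflib

a = ['1', '2', '3', '7']
b = ['2', '3', '7']

matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
match = matcher.find_longest_match(0, len(a), 0, len(b))

print(match)                               # Match(a=1, b=0, size=3)
print(a[match.a:match.a + match.size])     # ['2', '3', '7']
```

The Match object only gives positions and a size, so the common run itself has to be sliced out of one of the input sequences.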

-- 
Jerry


Re: [Tutor] longest common substring

2011-11-11 Thread lina
On Fri, Nov 11, 2011 at 9:10 PM, Andreas Perstinger
 wrote:
> On 2011-11-11 05:14, lina wrote:
>>
>> def LongestCommonSubstring(S1, S2):
>>     M = [[0]*(1+len(S2)) for i in xrange(1+len(S1))] ## creat 4*5 matrix
>>     longest, x_longest = 0, 0
>>     for x in xrange(1,1+len(S1)):                 ## read each row
>>         for y in xrange(1,1+len(S2)):             ## read each coloumn
>>             if S1[x-1] == S2[y-1]:
>>                 M[x][y] = M[x-1][y-1]+1
>>                 if M[x][y]>  longest:
>>                     longest = M[x][y]
>>                     x_longest = x
>>                 else:
>>                     M[x][y] = 0
>>     return S1[x_longest-longest:x_longest]
>
> That's still not the right version.
>
> If you compare your version to the one at wikibooks (
> http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Longest_common_substring#Python
> ), you'll see that the else-branch is wrongly indented (one level too deep).
> It belongs to the first if-comparison:
>
> if S1 ...
>     M[x][y] ...
>     if M[x][y] ...
>        ...
> else: ...
>
>
>> if __name__=="__main__":
>>
>>     a=open("atom-pair_4.txt","r").readline().strip()
>>
>>     b=open("atom-pair_8.txt","r").readline().strip()
>>
>>
>> print(LongestCommonSubstring(LongestCommonSubstring(a,a),LongestCommonSubstring(b,b)))
>
>                                    ^^^
> ??? What do you try to accomplish here ???
> You call "LongestCommonSubstring" with identical strings, thus the result
> must be the same string.
> Why not
>
> print(LongestCommonSubstring(a, b))
>
> as you did the line below with "c" and "d"?
>
> Further more I think that your problems start with bad data files. In every
> file there is just one very long line which looks like a string
> representation of a list of two-digits strings. This complicates further
> processing because you have to deal with all the unnecessary commas, blanks
> and single quotes between your numbers and the square brackets at the
> beginning and the end of the line.
>
>
>>  $ python3 LongestCommonSubstring.py
>> 2189
>> ['
>> ['82']
>>
>> The results are wrong.
>> c, d are the string from file atom-pair_4,txt, exactly the same as a,
>> d is the same as b.
>>
>> and even for (c,d) results are not correct, visually we can see some
>> similar groups, not mention the longest groups.
>
> And even if you use the correct function from wikibooks I can anticipate
> another problem :-)
> The implementation from wikibooks just returns the first common substring
> which it finds in the first string:
>
> >>> WikibooksLongestCommonSubstring("ABAB","BABA")
> 'ABA'
> >>> WikibooksLongestCommonSubstring("BABA", "ABAB")
> 'BAB'
>
> If there are more possible substrings with the same length (as in the
> example above) only the first one is returned.
>
> But in your example there are at least two different pathways (I've found
> three) which have the same length, as changing the order of the parameters
> will show you:
>
> >>> WikibooksLongestCommonSubstring(c, d)
> ['61', '70', '61']
> >>> WikibooksLongestCommonSubstring(d, c)
> ['83', '61', '83']

The residues of WikibooksLongestCommonSubstring(d, c) and
WikibooksLongestCommonSubstring(c,d) is different very largely,

I mean, it might be totally different.

so the possible subgroups are the union of groups of (d,c) and (c,d)?

I am really lack a programed-brain to think.

Thanks again,

>
> Bye, Andreas
>


Re: [Tutor] longest common substring

2011-11-11 Thread lina


Based on former advice, I made a correction/modification on the below code.

1] the set and subgroup does not work, here I wish to put all the
subgroup in a big set, the set like
$ python3 LongestCommonSubstring.py | uniq
{"1',"}
{"1', "}
{"1', '"}
{"1', '8"}
{"1', '82"}
{"1', '82'"}
{"1', '82',"}
{"1', '82', "}
{"1', '82', '"}
{"6', '61', '6"}
{"', '61', '63'"}
{"', '61', '63',"}
{"', '61', '63', "}
{"', '61', '63', '"}
{"', '61', '63', '6"}
{"', '61', '70', '61"}
{"', '61', '70', '61'"}
{"', '83', '61', '83',"}
{"', '83', '61', '83', "}
{"', '83', '61', '83', '"}

Please kindly notice I added a pipeline with uniq at the end, the true
prints were lots of replications, I don't know how to handle it in the
python code.

2] I still have trouble in reading files, mainly about not read "" etc.

Thanks with best regards,

#!/usr/bin/python3

import os.path

xrange = range

subgroups=[]
subgroup=[]
def LongestCommonSubstring(S1, S2):
    M = [[0]*(1+len(S2)) for i in xrange(1+len(S1))]
    longest, x_longest = 0, 0
    for x in xrange(1,1+len(S1)):
        for y in xrange(1,1+len(S2)):
            if S1[x-1] == S2[y-1]:
                M[x][y] = M[x-1][y-1]+1
                if M[x][y] > longest:
                    longest = M[x][y]
                    x_longest = x
                if longest >= 3:
                    subgroup=S1[x_longest-longest:x_longest]
                    subgroups=set([subgroup])
                    print(subgroups)
            else:
                    M[x][y] = 0

    return S1[x_longest-longest:x_longest]


if __name__=="__main__":

    a=open("atom-pair_4.txt","r").readline().strip()

    b=open("atom-pair_8.txt","r").readline().strip()

    LongestCommonSubstring(a,b)


Re: [Tutor] longest common substring

2011-11-11 Thread lina
On Fri, Nov 11, 2011 at 9:10 PM, Andreas Perstinger
 wrote:
> On 2011-11-11 05:14, lina wrote:
>>
>> def LongestCommonSubstring(S1, S2):
>>     M = [[0]*(1+len(S2)) for i in xrange(1+len(S1))] ## creat 4*5 matrix
>>     longest, x_longest = 0, 0
>>     for x in xrange(1,1+len(S1)):                 ## read each row
>>         for y in xrange(1,1+len(S2)):             ## read each coloumn
>>             if S1[x-1] == S2[y-1]:
>>                 M[x][y] = M[x-1][y-1]+1
>>                 if M[x][y]>  longest:
>>                     longest = M[x][y]
>>                     x_longest = x
>>                 else:
>>                     M[x][y] = 0
>>     return S1[x_longest-longest:x_longest]
>
> That's still not the right version.
>
> If you compare your version to the one at wikibooks (
> http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Longest_common_substring#Python
> ), you'll see that the else-branch is wrongly indented (one level too deep).
> It belongs to the first if-comparison:
>
> if S1 ...
>     M[x][y] ...
>     if M[x][y] ...
>        ...
> else: ...

Thanks, I was so careless and most important failure to give a deep
understanding.
>
>
>> if __name__=="__main__":
>>
>>     a=open("atom-pair_4.txt","r").readline().strip()
>>
>>     b=open("atom-pair_8.txt","r").readline().strip()
>>
>>
>> print(LongestCommonSubstring(LongestCommonSubstring(a,a),LongestCommonSubstring(b,b)))
>
>                                    ^^^
> ??? What do you try to accomplish here ???
> You call "LongestCommonSubstring" with identical strings, thus the result
> must be the same string.
Yes, the results is the same string,
actually I want to remove the distractions of "," and as you noticed,
the blanks and single quotes between numbers and the square brackets
at the two terminal of the line.
a=open("atom-pair_4.txt","r").readline().strip()
read so many unnecessary and I have a trouble of removing those distractions.

> Why not
>
> print(LongestCommonSubstring(a, b))
>
> as you did the line below with "c" and "d"?
>
> Further more I think that your problems start with bad data files. In every
> file there is just one very long line which looks like a string
> representation of a list of two-digits strings. This complicates further
> processing because you have to deal with all the unnecessary commas, blanks
> and single quotes between your numbers and the square brackets at the
> beginning and the end of the line.
>
>
>>  $ python3 LongestCommonSubstring.py
>> 2189
>> ['
>> ['82']
>>
>> The results are wrong.
>> c, d are the string from file atom-pair_4,txt, exactly the same as a,
>> d is the same as b.
>>
>> and even for (c,d) results are not correct, visually we can see some
>> similar groups, not mention the longest groups.
>
> And even if you use the correct function from wikibooks I can anticipate
> another problem :-)
> The implementation from wikibooks just returns the first common substring
> which it finds in the first string:
>
> >>> WikibooksLongestCommonSubstring("ABAB","BABA")
> 'ABA'
> >>> WikibooksLongestCommonSubstring("BABA", "ABAB")
> 'BAB'
>
> If there are more possible substrings with the same length (as in the
> example above) only the first one is returned.
You are right, I did not think of this parts before.

and actually the initiative wish was to find possible paths, I mean,
possible substrings, all possible substrings. not the longest one, but
at least bigger than 3.
>
> But in your example there are at least two different pathways (I've found
> three) which have the same length, as changing the order of the parameters
> will show you:
>
> >>> WikibooksLongestCommonSubstring(c, d)
> ['61', '70', '61']
> >>> WikibooksLongestCommonSubstring(d, c)
> ['83', '61', '83']
>
> Bye, Andreas

Thanks for your time,

Best regards,
>


Re: [Tutor] longest common substring

2011-11-11 Thread Andreas Perstinger

On 2011-11-11 05:14, lina wrote:

def LongestCommonSubstring(S1, S2):
    M = [[0]*(1+len(S2)) for i in xrange(1+len(S1))] ## creat 4*5 matrix
    longest, x_longest = 0, 0
    for x in xrange(1,1+len(S1)):                 ## read each row
        for y in xrange(1,1+len(S2)):             ## read each coloumn
            if S1[x-1] == S2[y-1]:
                M[x][y] = M[x-1][y-1]+1
                if M[x][y]>  longest:
                    longest = M[x][y]
                    x_longest = x
                else:
                    M[x][y] = 0
    return S1[x_longest-longest:x_longest]


That's still not the right version.

If you compare your version to the one at wikibooks ( 
http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Longest_common_substring#Python 
), you'll see that the else-branch is wrongly indented (one level too 
deep). It belongs to the first if-comparison:


if S1 ...
    M[x][y] ...
    if M[x][y] ...
       ...
else: ...
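
Written out in full for Python 3, the structure above could look like the following sketch. The lower-case function and variable names are my own choice; the logic is meant to follow the wikibooks version, with the else-branch attached to the first if:

```python
def longest_common_substring(s1, s2):
    # m[x][y] holds the length of the common run ending at s1[x-1] and s2[y-1]
    m = [[0] * (1 + len(s2)) for _ in range(1 + len(s1))]
    longest, x_longest = 0, 0
    for x in range(1, 1 + len(s1)):
        for y in range(1, 1 + len(s2)):
            if s1[x - 1] == s2[y - 1]:
                m[x][y] = m[x - 1][y - 1] + 1
                if m[x][y] > longest:
                    longest = m[x][y]
                    x_longest = x
            else:
                # no match: the run is broken, so its length goes back to 0
                m[x][y] = 0
    return s1[x_longest - longest:x_longest]


print(longest_common_substring("ABAB", "BABA"))                          # ABA
print(longest_common_substring(['1', '2', '3', '7'], ['2', '3', '7']))   # ['2', '3', '7']
```

Because only indexing, len() and slicing are used, the same function works for strings and for lists.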



if __name__=="__main__":

 a=open("atom-pair_4.txt","r").readline().strip()

 b=open("atom-pair_8.txt","r").readline().strip()

 
print(LongestCommonSubstring(LongestCommonSubstring(a,a),LongestCommonSubstring(b,b)))

^^^
??? What do you try to accomplish here ???
You call "LongestCommonSubstring" with identical strings, thus the 
result must be the same string.

Why not

print(LongestCommonSubstring(a, b))

as you did the line below with "c" and "d"?

Furthermore, I think that your problems start with bad data files. In 
every file there is just one very long line which looks like a string 
representation of a list of two-digit strings. This complicates further 
processing because you have to deal with all the unnecessary commas, 
blanks and single quotes between your numbers and the square brackets at 
the beginning and the end of the line.
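
If the existing files cannot be regenerated, one option is to let Python parse the line back into a list. This is only a sketch and assumes the line really is a valid Python list literal; the sample string is made up:

```python
import ast

line = "['71', '82', '80', '70']\n"    # made-up stand-in for the single line in a data file

# literal_eval only accepts Python literals, so it is safer than eval()
pairs = ast.literal_eval(line.strip())

print(pairs)          # ['71', '82', '80', '70']
print(type(pairs))    # <class 'list'>
```

After that, the comparison works on the list elements ('71', '82', ...) instead of on single characters such as quotes and commas.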




  $ python3 LongestCommonSubstring.py
2189
['
['82']

The results are wrong.
c, d are the string from file atom-pair_4,txt, exactly the same as a,
d is the same as b.

and even for (c,d) results are not correct, visually we can see some
similar groups, not mention the longest groups.


And even if you use the correct function from wikibooks I can anticipate 
another problem :-)
The implementation from wikibooks just returns the first common 
substring which it finds in the first string:


>>> WikibooksLongestCommonSubstring("ABAB","BABA")
'ABA'
>>> WikibooksLongestCommonSubstring("BABA", "ABAB")
'BAB'

If there are more possible substrings with the same length (as in the 
example above) only the first one is returned.
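
A possible variation that keeps every substring of maximal length instead of only the first one. This is a sketch of my own, not the wikibooks code, and the function name is hypothetical:

```python
def all_longest_common_substrings(s1, s2):
    m = [[0] * (1 + len(s2)) for _ in range(1 + len(s1))]
    longest = 0
    ends = set()      # end positions in s1 of all runs with the current maximal length
    for x in range(1, 1 + len(s1)):
        for y in range(1, 1 + len(s2)):
            if s1[x - 1] == s2[y - 1]:
                m[x][y] = m[x - 1][y - 1] + 1
                if m[x][y] > longest:
                    longest = m[x][y]
                    ends = {x}            # a longer run replaces all shorter ones
                elif m[x][y] == longest:
                    ends.add(x)           # another run of the same length
    # tuples are hashable, so the results of list inputs can be stored in a set
    return {tuple(s1[x - longest:x]) for x in ends}


print(sorted(all_longest_common_substrings("ABAB", "BABA")))
# [('A', 'B', 'A'), ('B', 'A', 'B')]
```

The result no longer depends on the order of the two arguments, at the price of getting tuples back instead of strings or lists.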


But in your example there are at least two different pathways (I've 
found three) which have the same length, as changing the order of the 
parameters will show you:


>>> WikibooksLongestCommonSubstring(c, d)
['61', '70', '61']
>>> WikibooksLongestCommonSubstring(d, c)
['83', '61', '83']

Bye, Andreas



Re: [Tutor] longest common substring

2011-11-10 Thread lina


#!/usr/bin/python3

import os.path

xrange = range


c=['71', '82', '80', '70', '84', '56', '58', '34', '77', '76', '61',
'76', '34', '76', '58', '34', '56', '61', '65', '82', '65', '80',
'65', '82', '80', '82', '65', '82', '61', '80', '82', '65', '61',
'63', '65', '70', '80', '71', '34', '71', '64', '34', '58', '61',
'80', '34', '40', '72', '38', '4', '70', '72', '40', '72', '4', '72',
'42', '69', '40', '70', '40', '61', '40', '34', '61', '33', '34',
'61', '34', '35', '61', '35', '61', '70', '61', '34', '61', '34',
'54', '34', '32', '35', '59', '55', '59', '34', '43', '32', '34',
'32', '24', '34', '32', '35', '32', '43', '34', '32', '34', '45',
'35', '32', '83', '61', '58', '32', '58', '83', '32', '34', '61',
'52', '34', '32', '34', '84', '32', '52', '34', '57', '34', '52',
'20', '58', '34', '32', '34', '58', '34', '58', '61', '34', '30',
'35', '28', '52', '22', '21', '22', '30', '61', '79', '70', '80',
'70', '65', '61', '80', '59', '52', '61', '20', '30', '20', '58',
'20', '29', '74', '58', '20', '31', '20', '31', '57', '31', '34',
'20', '58', '34', '52', '34', '20', '58', '83', '58', '34', '61',
'34', '32', '76', '34', '35', '52', '77', '76', '74', '76', '58',
'20', '57', '58', '33', '76', '58', '52', '74', '20', '36', '61',
'36', '74', '61', '36', '83', '61', '83', '31', '61', '59', '33',
'36', '61', '20', '34', '84', '70', '61', '36', '61', '36', '77',
'20', '38', '36', '61', '59', '38', '10', '38', '36', '38', '77',
'36', '39', '38', '36', '23', '26', '8', '36', '8', '19', '8', '19',
'8', '19', '20', '8', '36', '34', '8', '21', '8', '28', '22', '18',
'10', '20', '76', '36', '57', '20', '26', '10', '20', '28', '33',
'35', '36', '34', '36', '20', '34', '10', '36', '76', '57', '76',
'57', '16', '10', '59', '20', '19', '59', '20', '28', '20', '37',
'23', '38', '21', '23', '79', '32', '29', '36', '29', '31', '29',
'36', '20', '34', '79', '23', '20', '28', '20', '79', '74', '34',
'20', '59', '32', '20', '23', '28', '20', '10', '56', '22', '56',
'52', '57', '28', '76', '74', '20', '34', '77', '20', '36', '22',
'61', '59', '22', '20', '22', '21', '23', '20', '61', '59', '77',
'22', '34', '58', '20', '34', '28', '29', '22', '8', '22', '23', '20',
'59', '22', '20', '57', '20', '57', '22', '77', '20', '76', '36',
'20', '77', '23', '35', '77', '20', '8', '74', '10', '76', '20', '34',
'10', '31', '20', '33', '59', '61', '42', '41']

d=['45', '64', '13', '5', '64', '45', '13', '15', '13', '16', '10',
'7', '16', '10', '8', '16', '8', '10', '13', '64', '10', '45', '64',
'43', '64', '47', '64', '43', '64', '45', '47', '45', '15', '43',
'17', '64', '47', '64', '62', '75', '16', '60', '45', '64', '13',
'64', '75', '45', '47', '64', '75', '64', '60', '64', '60', '64',
'58', '60', '64', '45', '16', '64', '58', '16', '58', '60', '64', '7',
'60', '64', '7', '64', '47', '10', '64', '58', '64', '60', '58', '64',
'58', '75', '60', '64', '45', '64', '45', '58', '45', '60', '64',
'58', '64', '45', '60', '58', '75', '58', '75', '45', '60', '58',
'60', '58', '7', '13', '58', '49', '57', '64', '49', '63', '50', '63',
'49', '50', '81', '61', '49', '69', '70', '49', '39', '48', '83',
'29', '52', '39', '29', '52', '37', '52', '29', '52', '27', '83',
'52', '83', '52', '39', '27', '39', '27', '39', '41', '27', '29',
'39', '27', '83', '29', '39', '27', '29', '41', '39', '61', '28',
'41', '81', '28', '41', '28', '41', '81', '36', '51', '61', '59',
'53', '48', '53', '83', '59', '48', '59', '53', '57', '41', '83',
'61', '42', '81', '61', '40', '79', '41', '28', '59', '27', '33',
'28', '41', '83', '79', '81', '41', '61', '29', '39', '28', '61',
'39', '28', '42', '41', '31', '41', '84', '82', '84', '61', '31',
'41', '61', '41', '82', '28', '41', '57', '48', '59', '83', '48',
'83', '48', '57', '61', '57', '83', '42', '48', '61', '46', '48',
'51', '59', '51', '81', '51', '57', '51', '81', '51', '57', '48',
'59', '48', '83', '61', '83', '48', '81', '60', '48', '51', '48',
'57', '48', '51', '74', '53', '51', '53', '51', '81', '52', '51',
'61', '51', '41', '61', '83', '81', '83', '61', '81', '39', '28',
'41', '84', '42', '61', '36', '61', '63', '84', '83', '41', '72',
'41', '37', '39', '41', '82', '41', '61', '28', '39', '28', '41',
'39', '28', '41', '83', '41', '83', '61', '84', '83', '84', '83',
'51', '61', '83', '40', '83', '63', '61', '59', '28', '84', '42',
'28', '84', '61', '40', '41', '40', '41', '63', '84', '63', '59',
'83', '61', '59', '61', '39', '84', '72', '61', '40', '84', '61',
'83', '42', '59', '36', '40', '61', '63', '61', '59', '61', '40',
'29', '61', '29', '61', '39', '61', '31', '61', '70', '61']

def LongestCommonSubstring(S1, S2):
    M = [[0]*(1+len(S2)) for i in xrange(1+len(S1))] ## creat 4*5 matrix
    longest, x_longest = 0, 0
    for x in xrange(1,1+len(S1)): ## read each row
        for y in xrange(1,1+len(S2)): ## read each coloumn
            if S1[x-1] == S2[y-1]:
                M[x][y] = M[x-1][y-1]+1
                if M[x][y] > longest:
                    longest = M[x][y]

Re: [Tutor] longest common substring

2011-11-10 Thread lina
On Fri, Nov 11, 2011 at 1:23 AM, Walter Prins  wrote:
> Hi,
>
> On 10 November 2011 16:23, lina  wrote:
>>
>> def LongestCommonSubstring(S1, S2):
>>        M = [[0]*(1+len(S2)) for i in range(1+len(S1))]
>>        longest, x_longest = 0, 0
>>        for x in range(1,1+len(S1)):
>>                for y in range(1,1+len(S2)):
>>                        M[x][y] = M[x-1][y-1]+1
>>                        if M[x][y] > longest:
>>                                longest = M[x][y]
>>                                x_longest = x
>>                        else:
>>                                M[x][y] = 0
>>        return S1[x_longest-longest:x_longest]
>
> This is not the same as the implementation given on wikibooks. Have you
> tried reverting your changes and using the code that was given on the site
> exactly as is?  (I assume not, and if so, why not?)
I used python3, it showed me NameError: name 'xrange' is not defined
so I made a little changes, before I even worried I might forget to
import some modules to make the xrange work.
>
> (Specifically, I notice most the likely culprit is a missing if statement
> just below the "for y in range..." line that's been deleted)
Thanks for that. adding this missing line, works. I am still lack of
understanding how the code works, so made above mistake.
>
>>
>> The result isn't right.
>
> Yes.  You appear to have introduced a bug by not using the same code as what
> was given on the wiki page.  (Why did you modify the code and then when the
> modified code didn't work assume the original solution was broken instead of
> checking first and/or suspecting that your changes may have broken it?)
Sorry. I did not assume the original code was broken; I may have
unconsciously worried that it was out of date.
I only checked it by eye, and not carefully.

Thanks with best regards,
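
For anyone else reading who is unsure how the table in this function works, here is a minimal sketch. It is not taken from the wikibooks page, and the helper name lcs_table is made up for illustration. It builds the same matrix M and returns it so it can be printed: each cell M[x][y] holds the length of the common run that ends at S1[x-1] and S2[y-1], and a mismatch leaves the cell at 0.

```python
def lcs_table(S1, S2):
    """Build the match-length table used by LongestCommonSubstring.

    Hypothetical helper for illustration only.
    """
    # One extra row and column of zeros so that M[x-1][y-1] is always valid.
    M = [[0] * (1 + len(S2)) for _ in range(1 + len(S1))]
    for x in range(1, 1 + len(S1)):
        for y in range(1, 1 + len(S2)):
            if S1[x - 1] == S2[y - 1]:
                # Extend the run that ended one step up and to the left.
                M[x][y] = M[x - 1][y - 1] + 1
            # On a mismatch the cell simply keeps its initial 0.
    return M


if __name__ == "__main__":
    a = ['1', '2', '3', '7']
    b = ['2', '3', '7']
    for row in lcs_table(a, b):
        print(row)
```

With these two lists the largest entry is 3, in the last row (x = 4), so the longest common run is a[4-3:4], which is the same slice the original function returns.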


>
> Walter
>
>
> ___
> Tutor maillist  -  Tutor@python.org
> To unsubscribe or change subscription options:
> http://mail.python.org/mailman/listinfo/tutor
>
>
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] longest common substring

2011-11-10 Thread lina
On Fri, Nov 11, 2011 at 1:21 AM, Peter Otten <__pete...@web.de> wrote:
> lina wrote:
>
>> Hi,
>>
>> I tested the one from
>>
> http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Longest_common_substring
>>
>> mainly:
>>
>> #!/usr/bin/python3
>>
>> a=['1','2','3','7']
>>
>> b=['2','3','7']
>>
>> def LongestCommonSubstring(S1, S2):
>>         M = [[0]*(1+len(S2)) for i in range(1+len(S1))]
>>         longest, x_longest = 0, 0
>>         for x in range(1,1+len(S1)):
>>                 for y in range(1,1+len(S2)):
>>                         M[x][y] = M[x-1][y-1]+1
>>                         if M[x][y] > longest:
>>                                 longest = M[x][y]
>>                                 x_longest = x
>>                         else:
>>                                 M[x][y] = 0
>>         return S1[x_longest-longest:x_longest]
>>
>>
>> if __name__=="__main__":
>>         print(LongestCommonSubstring(a,b))
>>
>> $ python3 LongestCommonSubstring.py
>> ['1', '2', '3']
>>
>> The result isn't right.
>
> That's not the code from the site you quote. You messed it up when you tried
> to convert it to Python 3 (look for the suspicious 8-space indent)
>
You are right.
I have also corrected it to 4-space indentation now. Thanks.
> Hint: the function doesn't contain anything specific to Python 2 or 3, apart
> from the xrange builtin. If you add the line
>
> xrange = range

Thanks, I did not realize I could substitute range for xrange this way. Cool.

>
> to your code the unaltered version will run in Python 3 -- and produce the
> correct result:
>
> $ cat longest_common_substring3.py
> xrange = range
>
> def LongestCommonSubstring(S1, S2):
>    M = [[0]*(1+len(S2)) for i in xrange(1+len(S1))]
>    longest, x_longest = 0, 0
>    for x in xrange(1,1+len(S1)):
>        for y in xrange(1,1+len(S2)):
>            if S1[x-1] == S2[y-1]:

I did not understand the code well, and the line  if S1[x-1] == S2[y-1]:
went missing while I was typing it in (I did not copy/paste; I retyped it
to reinforce the learning).

>                M[x][y] = M[x-1][y-1] + 1
>                if M[x][y]>longest:
>                    longest = M[x][y]
>                    x_longest  = x
>            else:
>                M[x][y] = 0
>    return S1[x_longest-longest: x_longest]
>
> if __name__ == "__main__":
>    a = ['1','2','3','7']
>    b = ['2','3','7']
>
>    print(LongestCommonSubstring(a, b))
> $ python3 longest_common_substring3.py
> ['2', '3', '7']
>
>


Re: [Tutor] longest common substring

2011-11-10 Thread Walter Prins
Hi,

On 10 November 2011 16:23, lina  wrote:

> def LongestCommonSubstring(S1, S2):
>        M = [[0]*(1+len(S2)) for i in range(1+len(S1))]
>        longest, x_longest = 0, 0
>        for x in range(1,1+len(S1)):
>                for y in range(1,1+len(S2)):
>                        M[x][y] = M[x-1][y-1]+1
>                        if M[x][y] > longest:
>                                longest = M[x][y]
>                                x_longest = x
>                        else:
>                                M[x][y] = 0
>        return S1[x_longest-longest:x_longest]
>

This is not the same as the implementation given on wikibooks. Have you
tried reverting your changes and using the code that was given on the site
exactly as is?  (I assume not, and if so, why not?)

(Specifically, I notice the most likely culprit is a missing if statement
just below the "for y in range..." line that's been deleted)


> The result isn't right.
>

Yes.  You appear to have introduced a bug by not using the same code as
what was given on the wiki page.  (Why did you modify the code and then
when the modified code didn't work assume the original solution was broken
instead of checking first and/or suspecting that your changes may have
broken it?)

Walter
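
To make the effect of the missing if statement concrete, here is a rough side-by-side sketch. The function names lcs_without_if and lcs_with_if are made up for this illustration; the first body follows the retyped code quoted above, and the second follows the version with the comparison restored, as posted elsewhere in this thread.

```python
def lcs_without_if(S1, S2):
    """The retyped version: the element comparison is missing."""
    M = [[0] * (1 + len(S2)) for _ in range(1 + len(S1))]
    longest, x_longest = 0, 0
    for x in range(1, 1 + len(S1)):
        for y in range(1, 1 + len(S2)):
            # No "if S1[x-1] == S2[y-1]:" here, so every cell counts as a match.
            M[x][y] = M[x - 1][y - 1] + 1
            if M[x][y] > longest:
                longest = M[x][y]
                x_longest = x
            else:
                M[x][y] = 0
    return S1[x_longest - longest:x_longest]


def lcs_with_if(S1, S2):
    """The same loop with the comparison restored."""
    M = [[0] * (1 + len(S2)) for _ in range(1 + len(S1))]
    longest, x_longest = 0, 0
    for x in range(1, 1 + len(S1)):
        for y in range(1, 1 + len(S2)):
            if S1[x - 1] == S2[y - 1]:
                M[x][y] = M[x - 1][y - 1] + 1
                if M[x][y] > longest:
                    longest = M[x][y]
                    x_longest = x
            else:
                M[x][y] = 0
    return S1[x_longest - longest:x_longest]


a = ['1', '2', '3', '7']
b = ['2', '3', '7']
print(lcs_without_if(a, b))                   # ['1', '2', '3']
print(lcs_with_if(a, b))                      # ['2', '3', '7']
print(lcs_without_if(['x', 'y'], ['p', 'q']))  # ['x', 'y']
print(lcs_with_if(['x', 'y'], ['p', 'q']))     # []
```

Without the comparison the function never looks at the contents of the lists, only at their lengths, which is why it reports a "common" run even for two lists that share no elements.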


Re: [Tutor] longest common substring

2011-11-10 Thread Peter Otten
lina wrote:

> Hi,
> 
> I tested the one from
> 
http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Longest_common_substring
> 
> mainly:
> 
> #!/usr/bin/python3
> 
> a=['1','2','3','7']
> 
> b=['2','3','7']
> 
> def LongestCommonSubstring(S1, S2):
>         M = [[0]*(1+len(S2)) for i in range(1+len(S1))]
>         longest, x_longest = 0, 0
>         for x in range(1,1+len(S1)):
>                 for y in range(1,1+len(S2)):
>                         M[x][y] = M[x-1][y-1]+1
>                         if M[x][y] > longest:
>                                 longest = M[x][y]
>                                 x_longest = x
>                         else:
>                                 M[x][y] = 0
>         return S1[x_longest-longest:x_longest]
>
>
> if __name__=="__main__":
>         print(LongestCommonSubstring(a,b))
> 
> $ python3 LongestCommonSubstring.py
> ['1', '2', '3']
> 
> The result isn't right.

That's not the code from the site you quote. You messed it up when you tried 
to convert it to Python 3 (look for the suspicious 8-space indent)

Hint: the function doesn't contain anything specific to Python 2 or 3, apart 
from the xrange builtin. If you add the line

xrange = range

to your code the unaltered version will run in Python 3 -- and produce the 
correct result:

$ cat longest_common_substring3.py
xrange = range

def LongestCommonSubstring(S1, S2):
    M = [[0]*(1+len(S2)) for i in xrange(1+len(S1))]
    longest, x_longest = 0, 0
    for x in xrange(1,1+len(S1)):
        for y in xrange(1,1+len(S2)):
            if S1[x-1] == S2[y-1]:
                M[x][y] = M[x-1][y-1] + 1
                if M[x][y]>longest:
                    longest = M[x][y]
                    x_longest  = x
            else:
                M[x][y] = 0
    return S1[x_longest-longest: x_longest]

if __name__ == "__main__":
    a = ['1','2','3','7']
    b = ['2','3','7']

    print(LongestCommonSubstring(a, b))
$ python3 longest_common_substring3.py
['2', '3', '7']
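
If the same file also has to keep running under Python 2, one possible variation on the alias (a sketch only, not something the wikibooks page prescribes) is to create it only when the name is missing. The variable name values below is just for the demonstration.

```python
# Assumption: this file may be run under either Python 2 or Python 3.
try:
    xrange  # Python 2: the builtin already exists, nothing to do
except NameError:
    xrange = range  # Python 3: range is already lazy, so reuse it

# The rest of the module can now use xrange on either version.
values = list(xrange(1, 4))
print(values)  # [1, 2, 3]
```

On Python 3 the bare name lookup raises NameError, the except branch runs, and xrange becomes another name for range; on Python 2 the lookup succeeds and the builtin is left alone.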




[Tutor] longest common substring

2011-11-10 Thread lina
Hi,

I tested the one from
http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Longest_common_substring

mainly:

#!/usr/bin/python3

a=['1','2','3','7']

b=['2','3','7']

def LongestCommonSubstring(S1, S2):
        M = [[0]*(1+len(S2)) for i in range(1+len(S1))]
        longest, x_longest = 0, 0
        for x in range(1,1+len(S1)):
                for y in range(1,1+len(S2)):
                        M[x][y] = M[x-1][y-1]+1
                        if M[x][y] > longest:
                                longest = M[x][y]
                                x_longest = x
                        else:
                                M[x][y] = 0
        return S1[x_longest-longest:x_longest]


if __name__=="__main__":
        print(LongestCommonSubstring(a,b))

$ python3 LongestCommonSubstring.py
['1', '2', '3']

The result isn't right.

Thanks for your suggestions and comments,

Best regards,
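
One way to see what the expected answer for these two lists should be, independent of the hand-typed function, is to ask the standard library. The sketch below uses difflib.SequenceMatcher as a cross-check; it is a different implementation from the wikibooks one and is shown only as a sanity test, with autojunk switched off so that no elements are ignored as "popular".

```python
from difflib import SequenceMatcher

a = ['1', '2', '3', '7']
b = ['2', '3', '7']

# isjunk=None means no element is treated as junk.
matcher = SequenceMatcher(None, a, b, autojunk=False)

# Search the full range of both lists for the longest matching block.
match = matcher.find_longest_match(0, len(a), 0, len(b))

# match.a is the start index in a, match.size is the block length.
print(a[match.a:match.a + match.size])  # ['2', '3', '7']
```

Any result from LongestCommonSubstring that differs in length from this block is a sign that the function was copied incorrectly.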