Re: Problem with list.insert

2008-08-29 Thread John Machin
On Aug 29, 5:10 pm, SUBHABRATA <[EMAIL PROTECTED]> wrote:
> Dear group,
> Thanx for your idea to use dictionary instead of a list. Your code is
> more or less, OK, some problems are there, I'll debug them. Well, I
> feel the insert problem is coming because of the Hindi thing.

It's nothing to do with the Hindi thing. Quite simply, you are
inserting into the list over which you are iterating; this is the
"a16" in the first and last lines in the following snippet from your
code. The result of doing such a thing (in general, mutating a
container that is being iterated over) is not defined and can cause
all sorts of problems. It can be avoided by iterating over a copy of
the container that you want to change. However I suggest that you
seriously look at what you are actually trying to achieve, and rewrite
it.

    for word in a16:
        #MATCHING WITH GIVEN STRING
        a17=a2.find(word)
        if a17>-1:
            print "The word is found in the Source String"
            a18=a3.index(word)
            a19=a3[a18]
            print a19
            #INSERTING IN THE LIST OF TARGET STRING
            a20=a16.insert(a18,a19)
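
To see the effect in isolation, here is a minimal sketch. It is not taken from the original program: the function names and the `limit` guard are invented for the demonstration, and it uses Python 3 syntax. The first function inserts into the list it is looping over, the second loops over a copy.

```python
def insert_while_iterating(words, limit):
    """Insert into the very list being looped over (the problem case)."""
    steps = 0
    for word in words:
        if steps == limit:  # safety guard; without it this loop never ends
            break
        words.insert(0, word)  # everything shifts right, so `word` comes up again
        steps += 1
    return steps


def insert_over_copy(words):
    """Loop over a snapshot, so inserts cannot disturb the iteration."""
    steps = 0
    for word in list(words):
        words.insert(0, word)
        steps += 1
    return steps


runaway = ["has", "come"]
print(insert_while_iterating(runaway, limit=10), len(runaway))

safe = ["has", "come"]
print(insert_over_copy(safe), safe)
```

The first call only stops because of the guard; the second stops after one pass over the two original words.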

This code has several problems:
        if a8 in a5:
            a9=a5.index(a8)
            a10=a5[a9:]
            a11=re.search("\xe0.*?\n",a10)
            a12=a11.group()
            a13=a12[:-1]
            found.append(a13)
        elif a8 not in a5:
            a14=x
            not_found.append(a14)
        else:
            print "Error"
    found.extend(not_found)

(1) If you ever execute that print statement, it means that the end of
the universe is nigh -- throw away the else part and replace "elif a8
not in a5" with "else".

(2) The statement "found.extend(not_found)" is emitting a very foul
aroma. Your "found" list ends up with the translated words followed by
the untranslated words -- this is not very useful and you then have to
write some weird code to try to untangle it; just build your desired
output as you step through the words to be translated.

(3) Your "dictionary" is implemented as a string of the whole
dictionary contents -- you are linearly searching a long string for
each input word. You should load your dictionary file into a Python
dictionary, and load it *once* at the start of your program, not once
per input sentence.
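
Points (2) and (3) together might look something like the sketch below. The layout of the dictionary file is not known from the post, so the tab-separated "word<TAB>translation" format, the function names, and the sample entries are all assumptions made for illustration (Python 3 syntax).

```python
def load_dictionary(lines):
    """Build an English -> translation mapping from 'word<TAB>translation' lines.

    The tab-separated layout is an assumption made for this sketch.
    """
    table = {}
    for line in lines:
        line = line.rstrip("\n")
        if "\t" not in line:
            continue  # skip blank or malformed lines
        english, translation = line.split("\t", 1)
        table[english.lower()] = translation
    return table


def translate(sentence, table):
    """Translate word by word, leaving untranslated words where they were."""
    output = []
    for word in sentence.lower().split():
        if word in table:
            output.append(table[word])
        else:
            output.append(word)
    return " ".join(output)


# Load once, then reuse the same table for every input sentence.
table = load_dictionary(["has\tHAS\n", "come\tCOME\n"])
print(translate("Lincoln has come", table))
```

Because the output list is built in input order, there is nothing to untangle afterwards and no second loop that inserts into a list.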

> And Python2.5 is supporting Hindi quite fluently.

Python supports any 8-bit encoding to the extent that the platform's
console can display the characters correctly. What is the '\xe0'? The
PC-ISCII ATR character?

Cheers,
John
--
http://mail.python.org/mailman/listinfo/python-list


Re: Problem with list.insert

2008-08-29 Thread SUBHABRATA
Dear group,
Thanx for your idea to use dictionary instead of a list. Your code is
more or less, OK, some problems are there, I'll debug them. Well, I
feel the insert problem is coming because of the Hindi thing.
And Python2.5 is supporting Hindi quite fluently.
I am writing in Python2.5.1.
Best Regards,
Subhabrata.

Terry Reedy wrote:
> SUBHABRATA, I recommend you study this excellent response carefully.
>
> castironpi wrote:
> > On Aug 28, 11:13 am, SUBHABRATA <[EMAIL PROTECTED]> wrote:
> >-.
> >
> > Instead split up your inputs first thing.
> >
> > trans= { 'a': 'A', 'at': 'AT', 'to': 'TO' }
> > sample= 'a boy at the park walked to the tree'
> > expected= 'A boy AT the park walked TO the tree'
>
> It starts with a concrete test case -- an 'executable problem
> statement'.  To me, this is clearer and more useful than the 20 lines of
> prose you used.  A single line English statement would be "Problem:
> Replace selected words in a text using a dictionary."  Sometimes, less
> (words) really is more (understanding).
>
> If the above is *not* what you meant, then give a similarly concrete
> example that does what you *do* mean.
>
> > sample_list= sample.split( )
> > for i, x in enumerate( sample_list ):
> >     if x in trans:
> >         sample_list[ i ]= trans[ x ]
>
> Meaningful names make the code easy to understand.  Meaningless numbered
> 'a's require each reader to create meaningful names and associate them
> in his/her head.  But that is part of the job of the programmer.
>
> > result= ' '.join( sample_list )
> > print result
> > assert result== expected
>
> It ends with an automated test that is easy to rerun should the code in
> between need to be modified.  Assert only prints something if there is
> an error.  With numerous tests, that is what one often wants.  But with
> only one, you might prefer 'print' instead of 'assert' to get a more
> reassuring and satisfying 'True' printed.
>
> > Then replace them as you visit each one, and join them later.
>
> If you are using Hindi characters, you might want to use Python3 when it
> arrives, since it will use Unicode strings as the (default) string type.
>   But for posting here, stick with the ascii subset.
>
> Terry Jan Reedy


Re: Problem with list.insert

2008-08-28 Thread Terry Reedy


SUBHABRATA, I recommend you study this excellent response carefully.

castironpi wrote:

On Aug 28, 11:13 am, SUBHABRATA <[EMAIL PROTECTED]> wrote:
-.

Instead split up your inputs first thing.

trans= { 'a': 'A', 'at': 'AT', 'to': 'TO' }
sample= 'a boy at the park walked to the tree'
expected= 'A boy AT the park walked TO the tree'


It starts with a concrete test case -- an 'executable problem 
statement'.  To me, this is clearer and more useful than the 20 lines of 
prose you used.  A single line English statement would be "Problem: 
Replace selected words in a text using a dictionary."  Sometimes, less 
(words) really is more (understanding).


If the above is *not* what you meant, then give a similarly concrete 
example that does what you *do* mean.



sample_list= sample.split( )
for i, x in enumerate( sample_list ):
    if x in trans:
        sample_list[ i ]= trans[ x ]


Meaningful names make the code easy to understand.  Meaningless numbered 
'a's require each reader to create meaningful names and associate them 
in his/her head.  But that is part of the job of the programmer.



result= ' '.join( sample_list )
print result
assert result== expected


It ends with an automated test that is easy to rerun should the code in 
between need to be modified.  Assert only prints something if there is 
an error.  With numerous tests, that is what one often wants.  But with 
only one, you might prefer 'print' instead of 'assert' to get a more 
reassuring and satisfying 'True' printed.
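
As a sketch of the "numerous tests" situation: the helper name `replace_words` and the two extra cases are made up for illustration, and the code uses Python 3 syntax. The loop stays silent unless a case fails, and prints a single summary line at the end.

```python
def replace_words(text, trans):
    """Replace each word found in `trans`, keep every other word as it is."""
    return " ".join(trans.get(word, word) for word in text.split())


trans = {"a": "A", "at": "AT", "to": "TO"}
cases = [
    ("a boy at the park walked to the tree",
     "A boy AT the park walked TO the tree"),
    ("no match here", "no match here"),
    ("", ""),
]

for text, expected in cases:
    result = replace_words(text, trans)
    # The message after the comma is only shown when the assertion fails.
    assert result == expected, "%r gave %r, expected %r" % (text, result, expected)

print("all %d cases passed" % len(cases))
```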



Then replace them as you visit each one, and join them later.


If you are using Hindi characters, you might want to use Python3 when it 
arrives, since it will use Unicode strings as the (default) string type. 
 But for posting here, stick with the ascii subset.
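
A small sketch of what that difference means in practice, in Python 3 syntax. The sample word is arbitrary and UTF-8 is an assumed encoding; the post does not say how the dictionary file is encoded.

```python
word = "यात्रा"                   # a Unicode string: a sequence of code points
encoded = word.encode("utf-8")   # the same text as raw bytes

print(len(word))      # counts code points
print(len(encoded))   # counts bytes, three per Devanagari code point in UTF-8

# Decoding the bytes gives back the original text.
assert encoded.decode("utf-8") == word
```

Searching and slicing a Unicode string works on whole characters, whereas the same operations on a byte string can cut a character in the middle.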


Terry Jan Reedy



Re: Problem with list.insert

2008-08-28 Thread castironpi
On Aug 28, 11:13 am, SUBHABRATA <[EMAIL PROTECTED]> wrote:
> Dear Group,
> I wrote one program,
> There is a dictionary.
> There is an input string.
> Every word of input string the word is matched against the dictionary
> If the word of input string is matched against the dictionary it gives
> the word of the dictionary.
> But if it does not find it gives the original word.
> After searching the words are joined back.
> But as I am joining I am finding the words which are not available in
> dictionary are printed in the last even if the word is given in the
> first/middle.
> Now, I want to take them in order.
> I am applying a thumb rule that the position of the word of the string
> is exact with the resultant string.
> So, I am determining the word which is not in the dictionary, and its
> position in the input string.
> Now I am inserting it in the target string, for this I am splitting
> both the given string and the output/result string.
> Till now it is working fine.
> But a problem happening is that if I insert it it is inserting same
> words multiple times and the program seems to be an unending process.
> What is the error happening?
> If any one can suggest.
> The code is given below:

Warning, -spoiler-.

Instead split up your inputs first thing.

trans= { 'a': 'A', 'at': 'AT', 'to': 'TO' }
sample= 'a boy at the park walked to the tree'
expected= 'A boy AT the park walked TO the tree'

sample_list= sample.split( )
for i, x in enumerate( sample_list ):
    if x in trans:
        sample_list[ i ]= trans[ x ]

result= ' '.join( sample_list )
print result
assert result== expected

Then replace them as you visit each one, and join them later.


Re: Problem with list.insert

2008-08-28 Thread bearophileHUGS
Subhabrata, it's very difficult for me to understand what your short
program has to do, or what you say. I think that formatting and code
style are important.

So I suggest you give meaningful names to all your variables,
remove unused variables (like n), add blank lines here and there
to separate logically distinct parts of your program, or even better
split it into functions. You can remove some intermediate steps by
coalescing a few logically related operations into one line, put
spaces around operators like = and after commas, and show a
usage example in English, so people can understand what the program is
supposed to do. You can also avoid joining and then splitting strings
again, and remove useless () around certain things.

This is a possible re-write of the first part of your code, it's not
exactly equal...


def input_words():
    input_message = "Print one English sentence for dictionary check: "
    return raw_input(input_message).lower().split()


def load_dictionary():
    return set(line.rstrip() for line in open("words.txt"))


def dictionary_search(dictionary, words):
    found = []
    not_found = []

    for word in words:
        if word in dictionary:
            found.append(word)
        else:
            not_found.append(word)

    return found + not_found


inwords = input_words()
dictionary = load_dictionary()
print dictionary_search(dictionary, inwords)


It's far from perfect, but you can use it as starting point for a
rewrite of your whole program.
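
One possible next step, as a sketch only: `dictionary_search` above still returns the found words followed by the missing ones, which is the ordering the original question complains about. The variant below keeps input order by tagging each word instead of sorting it into two lists. The function name and sample data are hypothetical, and it uses Python 3 syntax.

```python
def dictionary_search_in_order(dictionary, words):
    """Return (word, was_found) pairs in the same order as the input."""
    return [(word, word in dictionary) for word in words]


dictionary = set(["has", "come"])
for word, was_found in dictionary_search_in_order(dictionary, ["lincoln", "has", "come"]):
    print(word, was_found)
```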

Bye,
bearophile


Re: Problem with list.insert

2008-08-28 Thread Diez B. Roggisch

Diez B. Roggisch schrieb:

SUBHABRATA schrieb:

Some people in the room told I am kidding, but I learnt Python from
Python docs which gives examples like these,
But I write explicit comments,
an excerpt from python docs:
>>> # Measure some strings:
... a = ['cat', 'window', 'defenestrate']
>>> for x in a:
...     print x, len(x)
...
cat 3
window 6
defenestrate 12
But well, if you are suggesting improvement I'll surely listen.


Please! Just because a tiny 3-line example involving just *one* list 
doesn't give that a long & speaking name does not mean


Discard my last post - I accidentally pressed submit too early.

Numbered variable names are surely *not* found in any Python example. 
Short names do occur, yes, since the examples are clear and don't require 
more meaningful names. But nowhere will you find two-figure enumerations.


Every book or tutorial about programming will teach you to use meaningful 
variable names in your program.


As far as your explanation goes: there is *nothing* to be understood 
from a bunch of question marks, and the occasional "lincoln" spread in 
between does not really help.


This is most probably not your fault, as somehow the Hindi gets twisted 
into question marks - however, I suggest you provide an example where 
the Hindi is replaced with English words (translations, or placeholders) 
- otherwise, you won't be understood, and can't be helped.
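
A rough sketch of how such placeholders could be produced automatically before posting. The function name and the HINDI1, HINDI2 labels are invented here, and the code uses Python 3 syntax.

```python
def ascii_placeholders(text):
    """Replace every word containing non-ASCII characters with a stable label."""
    labels = {}
    output = []
    for word in text.split():
        if all(ord(ch) < 128 for ch in word):
            output.append(word)  # plain ASCII word, keep it
        else:
            if word not in labels:
                labels[word] = "HINDI%d" % (len(labels) + 1)
            output.append(labels[word])  # same word always gets the same label
    return " ".join(output)


print(ascii_placeholders("lincoln रहे रहे यात्रा lincoln"))
```

Repeated words keep the same label, so readers can still follow where each word ends up in the output.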


Diez


Re: Problem with list.insert

2008-08-28 Thread Diez B. Roggisch

SUBHABRATA schrieb:

Some people in the room told I am kidding, but I learnt Python from
Python docs which gives examples like these,
But I write explicit comments,
an excerpt from python docs:
>>> # Measure some strings:
... a = ['cat', 'window', 'defenestrate']
>>> for x in a:
...     print x, len(x)
...
cat 3
window 6
defenestrate 12
But well, if you are suggesting improvement I'll surely listen.


Please! Just because a tiny 3-line example involving just *one* list 
doesn't give that a long & speaking name does not mean



The outputs are given in Hindi, it is a dictionary look up program,
the matching words are in Hindi, you may leave aside them.
How to debug the result string is to see the words which are in
English as the group page does not take italics so I am putting one
asterisk* after it
NO PROBLEM:
INPUT:
he has come
OUTPUT IS
उओह/ उन्होने रहेसाक्ता २.यात्राकरना
PROBLEM:
INPUT:
(i) Lincoln* has come
OUTPUT IS:
रहेसाक्ता २.यात्राकरना lincoln*
lincoln lincoln* रहेसाक्ता २.यात्राकरना lincoln
lincoln lincoln* lincoln* रहेसाक्ता २.यात्राकरना lincoln
….and increasing the number and seems a never ending process.
MY EXPECTED STRING IS:
lincoln रहेसाक्ता २.यात्राकरना lincoln^
The latter places marked^ I am editing don't worry for that,
though MY FINAL EXPECTED STRING IS:
lincoln रहेसाक्ता २.यात्राकरना
Best Regards,
Subhabrata.



Marc 'BlackJack' Rintsch wrote:

On Thu, 28 Aug 2008 09:13:00 -0700, SUBHABRATA wrote:


import re
def wordchecker1(n):
# INPUTTING STRING
a1=raw_input("PRINT ONE ENGLISH SENTENCE FOR DICTIONARY CHECK:")
#CONVERTING TO LOWER CASE
a2=a1.lower()
#CONVERTING INTO LIST
a3=a2.split()
#DICTIONARY
a4=open("/python25/Changedict3.txt","r")
a5=a4.read()
a6=a5.split()
found=[]
not_found=[]
   #SEARCHING DICTIONARY
for x in a3:
a7="\n"
a8=a7+x
if a8 in a5:
a9=a5.index(a8)
a10=a5[a9:]
a11=re.search("\xe0.*?\n",a10)
a12=a11.group()
a13=a12[:-1]
found.append(a13)
elif a8 not in a5:
a14=x
not_found.append(a14)
else:
print "Error"
found.extend(not_found)
# THE OUTPUT
print "OUTPUT STRING IS"
a15=(' '.join(found))
#THE OUTPUT STRING
print a15
# SPLITTING OUTPUT STRING IN WORDS
a16=a15.split()
#TAKING OUT THE WORD FROM OUTPUT STRING
for word in a16:
#MATCHING WITH GIVEN STRING
a17=a2.find(word)
if a17>-1:
print "The word is found in the Source String"
a18=a3.index(word)
a19=a3[a18]
print a19
#INSERTING IN THE LIST OF TARGET STRING
a20=a16.insert(a18,a19)
print a16
a21=(" ".join(a16))
print a21

a1, a2, a2, …, a20?  You must be kidding.  Please stop numbering names
and use *meaningful* names instead!

Could you describe the problem better, with sample inputs and expected
outputs?  There must be a better way than that unreadable mess above.

Ciao,
Marc 'BlackJack' Rintsch


Re: Problem with list.insert

2008-08-28 Thread SUBHABRATA
Some people in the room told I am kidding, but I learnt Python from
Python docs which gives examples like these,
But I write explicit comments,
an excerpt from python docs:
>>> # Measure some strings:
... a = ['cat', 'window', 'defenestrate']
>>> for x in a:
...     print x, len(x)
...
cat 3
window 6
defenestrate 12
But well, if you are suggesting improvement I'll surely listen.

The outputs are given in Hindi, it is a dictionary look up program,
the matching words are in Hindi, you may leave aside them.
How to debug the result string is to see the words which are in
English as the group page does not take italics so I am putting one
asterisk* after it
NO PROBLEM:
INPUT:
he has come
OUTPUT IS
उओह/ उन्होने रहेसाक्ता २.यात्राकरना
PROBLEM:
INPUT:
(i) Lincoln* has come
OUTPUT IS:
रहेसाक्ता २.यात्राकरना lincoln*
lincoln lincoln* रहेसाक्ता २.यात्राकरना lincoln
lincoln lincoln* lincoln* रहेसाक्ता २.यात्राकरना lincoln
….and increasing the number and seems a never ending process.
MY EXPECTED STRING IS:
lincoln रहेसाक्ता २.यात्राकरना lincoln^
The latter places marked^ I am editing don't worry for that,
though MY FINAL EXPECTED STRING IS:
lincoln रहेसाक्ता २.यात्राकरना
Best Regards,
Subhabrata.



Marc 'BlackJack' Rintsch wrote:
> On Thu, 28 Aug 2008 09:13:00 -0700, SUBHABRATA wrote:
>
> > import re
> > def wordchecker1(n):
> > # INPUTTING STRING
> > a1=raw_input("PRINT ONE ENGLISH SENTENCE FOR DICTIONARY CHECK:")
> > #CONVERTING TO LOWER CASE
> > a2=a1.lower()
> > #CONVERTING INTO LIST
> > a3=a2.split()
> > #DICTIONARY
> > a4=open("/python25/Changedict3.txt","r")
> > a5=a4.read()
> > a6=a5.split()
> > found=[]
> > not_found=[]
> >#SEARCHING DICTIONARY
> > for x in a3:
> > a7="\n"
> > a8=a7+x
> > if a8 in a5:
> > a9=a5.index(a8)
> > a10=a5[a9:]
> > a11=re.search("\xe0.*?\n",a10)
> > a12=a11.group()
> > a13=a12[:-1]
> > found.append(a13)
> > elif a8 not in a5:
> > a14=x
> > not_found.append(a14)
> > else:
> > print "Error"
> > found.extend(not_found)
> > # THE OUTPUT
> > print "OUTPUT STRING IS"
> > a15=(' '.join(found))
> > #THE OUTPUT STRING
> > print a15
> > # SPLITTING OUTPUT STRING IN WORDS
> > a16=a15.split()
> > #TAKING OUT THE WORD FROM OUTPUT STRING
> > for word in a16:
> > #MATCHING WITH GIVEN STRING
> > a17=a2.find(word)
> > if a17>-1:
> > print "The word is found in the Source String"
> > a18=a3.index(word)
> > a19=a3[a18]
> > print a19
> > #INSERTING IN THE LIST OF TARGET STRING
> > a20=a16.insert(a18,a19)
> > print a16
> > a21=(" ".join(a16))
> > print a21
>
> a1, a2, a2, …, a20?  You must be kidding.  Please stop numbering names
> and use *meaningful* names instead!
>
> Could you describe the problem better, with sample inputs and expected
> outputs?  There must be a better way than that unreadable mess above.
>
> Ciao,
>   Marc 'BlackJack' Rintsch

Re: Problem with list.insert

2008-08-28 Thread Marc 'BlackJack' Rintsch
On Thu, 28 Aug 2008 09:13:00 -0700, SUBHABRATA wrote:

> import re
> def wordchecker1(n):
> # INPUTTING STRING
> a1=raw_input("PRINT ONE ENGLISH SENTENCE FOR DICTIONARY CHECK:")
> #CONVERTING TO LOWER CASE
> a2=a1.lower()
> #CONVERTING INTO LIST
> a3=a2.split()
> #DICTIONARY
> a4=open("/python25/Changedict3.txt","r")
> a5=a4.read()
> a6=a5.split()
> found=[]
> not_found=[]
>#SEARCHING DICTIONARY
> for x in a3:
> a7="\n"
> a8=a7+x
> if a8 in a5:
> a9=a5.index(a8)
> a10=a5[a9:]
> a11=re.search("\xe0.*?\n",a10)
> a12=a11.group()
> a13=a12[:-1]
> found.append(a13)
> elif a8 not in a5:
> a14=x
> not_found.append(a14)
> else:
> print "Error"
> found.extend(not_found)
> # THE OUTPUT
> print "OUTPUT STRING IS"
> a15=(' '.join(found))
> #THE OUTPUT STRING
> print a15
> # SPLITTING OUTPUT STRING IN WORDS
> a16=a15.split()
> #TAKING OUT THE WORD FROM OUTPUT STRING
> for word in a16:
> #MATCHING WITH GIVEN STRING
> a17=a2.find(word)
> if a17>-1:
> print "The word is found in the Source String"
> a18=a3.index(word)
> a19=a3[a18]
> print a19
> #INSERTING IN THE LIST OF TARGET STRING
> a20=a16.insert(a18,a19)
> print a16
> a21=(" ".join(a16))
> print a21

a1, a2, a2, …, a20?  You must be kidding.  Please stop numbering names 
and use *meaningful* names instead!

Could you describe the problem better, with sample inputs and expected 
outputs?  There must be a better way than that unreadable mess above.

Ciao,
Marc 'BlackJack' Rintsch

Problem with list.insert

2008-08-28 Thread SUBHABRATA
Dear Group,
I wrote one program,
There is a dictionary.
There is an input string.
Every word of input string the word is matched against the dictionary
If the word of input string is matched against the dictionary it gives
the word of the dictionary.
But if it does not find it gives the original word.
After searching the words are joined back.
But as I am joining I am finding the words which are not available in
dictionary are printed in the last even if the word is given in the
first/middle.
Now, I want to take them in order.
I am applying a thumb rule that the position of the word of the string
is exact with the resultant string.
So, I am determining the word which is not in the dictionary, and its
position in the input string.
Now I am inserting it in the target string, for this I am splitting
both the given string and the output/result string.
Till now it is working fine.
But a problem happening is that if I insert it it is inserting same
words multiple times and the program seems to be an unending process.
What is the error happening?
If any one can suggest.
The code is given below:
import re
def wordchecker1(n):
    # INPUTTING STRING
    a1=raw_input("PRINT ONE ENGLISH SENTENCE FOR DICTIONARY CHECK:")
    #CONVERTING TO LOWER CASE
    a2=a1.lower()
    #CONVERTING INTO LIST
    a3=a2.split()
    #DICTIONARY
    a4=open("/python25/Changedict3.txt","r")
    a5=a4.read()
    a6=a5.split()
    found=[]
    not_found=[]
    #SEARCHING DICTIONARY
    for x in a3:
        a7="\n"
        a8=a7+x
        if a8 in a5:
            a9=a5.index(a8)
            a10=a5[a9:]
            a11=re.search("\xe0.*?\n",a10)
            a12=a11.group()
            a13=a12[:-1]
            found.append(a13)
        elif a8 not in a5:
            a14=x
            not_found.append(a14)
        else:
            print "Error"
    found.extend(not_found)
    # THE OUTPUT
    print "OUTPUT STRING IS"
    a15=(' '.join(found))
    #THE OUTPUT STRING
    print a15
    # SPLITTING OUTPUT STRING IN WORDS
    a16=a15.split()
    #TAKING OUT THE WORD FROM OUTPUT STRING
    for word in a16:
        #MATCHING WITH GIVEN STRING
        a17=a2.find(word)
        if a17>-1:
            print "The word is found in the Source String"
            a18=a3.index(word)
            a19=a3[a18]
            print a19
            #INSERTING IN THE LIST OF TARGET STRING
            a20=a16.insert(a18,a19)
        print a16
        a21=(" ".join(a16))
        print a21
Best Regards,
Subhabrata.
