Re: Problem with list.insert
On Aug 29, 5:10 pm, SUBHABRATA <[EMAIL PROTECTED]> wrote:
> Dear group,
> Thanx for your idea to use dictionary instead of a list. Your code is
> more or less, OK, some problems are there, I'll debug them. Well, I
> feel the insert problem is coming because of the Hindi thing.

It's nothing to do with the Hindi thing. Quite simply, you are
inserting into the list over which you are iterating; this is the "a16"
in the first and last lines in the following snippet from your code.
The result of doing such a thing (in general, mutating a container that
is being iterated over) is not defined and can cause all sorts of
problems. It can be avoided by iterating over a copy of the container
that you want to change. However I suggest that you seriously look at
what you are actually trying to achieve, and rewrite it.

    for word in a16:
        #MATCHING WITH GIVEN STRING
        a17=a2.find(word)
        if a17>-1:
            print "The word is found in the Source String"
            a18=a3.index(word)
            a19=a3[a18]
            print a19
            #INSERTING IN THE LIST OF TARGET STRING
            a20=a16.insert(a18,a19)

This code has several problems:

    if a8 in a5:
        a9=a5.index(a8)
        a10=a5[a9:]
        a11=re.search("\xe0.*?\n",a10)
        a12=a11.group()
        a13=a12[:-1]
        found.append(a13)
    elif a8 not in a5:
        a14=x
        not_found.append(a14)
    else:
        print "Error"
    found.extend(not_found)

(1) If you ever execute that print statement, it means that the end of
the universe is nigh -- throw away the else part and replace "elif a8
not in a5" with "else".

(2) The statement "found.extend(not_found)" is emitting a very foul
aroma. Your "found" list ends up with the translated words followed by
the untranslated words -- this is not very useful and you then have to
write some weird code to try to untangle it; just build your desired
output as you step through the words to be translated.

(3) Your "dictionary" is implemented as a string of the whole
dictionary contents -- you are linearly searching a long string for
each input word. You should load your dictionary file into a Python
dictionary, and load it *once* at the start of your program, not once
per input sentence.

> And Python2.5 is supporting Hindi quite fluently.

Python supports any 8-bit encoding to the extent that the platform's
console can display the characters correctly. What is the '\xe0'? The
PC-ISCII ATR character?

Cheers,
John
--
http://mail.python.org/mailman/listinfo/python-list
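The point about inserting into a list while iterating over it can be seen in a few lines. This is only a minimal sketch, written in Python 3 syntax (the thread itself uses Python 2.5); the function names and the `limit` safety cap are invented for the illustration and are not part of the original program.

```python
def insert_while_iterating(words, limit=10):
    """Insert into the list being iterated; stop after `limit` inserts.

    `limit` is a safety cap added for this sketch only: without it the
    loop would not finish, because every insert makes the list longer.
    """
    steps = 0
    for word in words:
        if steps == limit:
            break
        words.insert(0, word)   # mutates the list the for loop is walking
        steps += 1
    return words


def insert_using_copy(words):
    """Iterate over a snapshot, so the inserts do not feed the loop."""
    for word in list(words):    # list(words) makes a shallow copy
        words.insert(0, word)
    return words


runaway = insert_while_iterating(["lincoln", "has", "come"])
safe = insert_using_copy(["lincoln", "has", "come"])
print(len(runaway), runaway.count("lincoln"))
print(safe)
```

In the first function every insert at the front pushes the remaining items one place to the right, so the loop keeps meeting the same word again and never reaches the end on its own. The second function terminates because the loop walks the copy while the inserts go into the original.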
Re: Problem with list.insert
Dear group,
Thanx for your idea to use dictionary instead of a list. Your code is
more or less, OK, some problems are there, I'll debug them. Well, I
feel the insert problem is coming because of the Hindi thing. And
Python2.5 is supporting Hindi quite fluently. I am writing in
Python2.5.1.
Best Regards,
Subhabrata.
Re: Problem with list.insert
SUBHABRATA, I recommend you study this excellent response carefully.

castironpi wrote:
> On Aug 28, 11:13 am, SUBHABRATA <[EMAIL PROTECTED]> wrote:
>
> Instead split up your inputs first thing.
>
> trans= { 'a': 'A', 'at': 'AT', 'to': 'TO' }
> sample= 'a boy at the park walked to the tree'
> expected= 'A boy AT the park walked TO the tree'

It starts with a concrete test case -- an 'executable problem
statement'. To me, this is clearer and more useful than the 20 lines of
prose you used. A single line English statement would be "Problem:
Replace selected words in a text using a dictionary." Sometimes, less
(words) really is more (understanding).

If the above is *not* what you meant, then give a similarly concrete
example that does what you *do* mean.

> sample_list= sample.split( )
> for i, x in enumerate( sample_list ):
>     if x in trans:
>         sample_list[ i ]= trans[ x ]

Meaningful names make the code easy to understand. Meaningless numbered
'a's require each reader to create meaningful names and associate them
in his/her head. But that is part of the job of the programmer.

> result= ' '.join( sample_list )
> print result
> assert result== expected

It ends with an automated test that is easy to rerun should the code in
between need to be modified. Assert only prints something if there is
an error. With numerous tests, that is what one often wants. But with
only one, you might prefer 'print' instead of 'assert' to get a more
reassuring and satisfying 'True' printed.

> Then replace them as you visit each one, and join them later.

If you are using Hindi characters, you might want to use Python3 when it
arrives, since it will use Unicode strings as the (default) string type.
But for posting here, stick with the ascii subset.

Terry Jan Reedy
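The "executable problem statement" idea scales to several cases once the replacement step is wrapped in a function. The following is a sketch only, in Python 3 syntax; the function name `translate` and the two extra test cases are assumptions added for illustration, not something posted in the thread.

```python
def translate(text, trans):
    """Replace each whitespace-separated word that has an entry in trans."""
    words = text.split()
    for i, word in enumerate(words):
        if word in trans:
            words[i] = trans[word]   # same length, so safe while enumerating
    return ' '.join(words)


trans = {'a': 'A', 'at': 'AT', 'to': 'TO'}

# Each case is (input, expected output); the first one is from the thread.
cases = [
    ('a boy at the park walked to the tree',
     'A boy AT the park walked TO the tree'),
    ('the tree', 'the tree'),    # no word has an entry
    ('', ''),                    # empty input
]

for sample, expected in cases:
    result = translate(sample, trans)
    assert result == expected, (sample, result, expected)
print('all %d cases passed' % len(cases))
```

The asserts stay silent when everything is right, and the final print gives the reassuring line mentioned above.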
Re: Problem with list.insert
On Aug 28, 11:13 am, SUBHABRATA <[EMAIL PROTECTED]> wrote:
> Dear Group,
> I wrote one program,
> There is a dictionary.
> There is an input string.
> Every word of input string the word is matched against the dictionary
> If the word of input string is matched against the dictionary it gives
> the word of the dictionary.
> But if it does not find it gives the original word.
> After searching the words are joined back.
> But as I am joining I am finding the words which are not available in
> dictionary are printed in the last even if the word is given in the
> first/middle.
> Now, I want to take them in order.
> I am applying a thumb rule that the position of the word of the string
> is exact with the resultant string.
> So, I am determining the word which is not in the dictionary, and its
> position in the input string.
> Now I am inserting it in the target string, for this I am splitting
> both the given string and the output/result string.
> Till now it is working fine.
> But a problem happening is that if I insert it it is inserting same
> words multiple times and the program seems to be an unending process.
> What is the error happening?
> If any one can suggest.
> The code is given below:

Warning, -spoiler-.

Instead split up your inputs first thing.

    trans= { 'a': 'A', 'at': 'AT', 'to': 'TO' }
    sample= 'a boy at the park walked to the tree'
    expected= 'A boy AT the park walked TO the tree'

    sample_list= sample.split( )
    for i, x in enumerate( sample_list ):
        if x in trans:
            sample_list[ i ]= trans[ x ]

    result= ' '.join( sample_list )
    print result
    assert result== expected

Then replace them as you visit each one, and join them later.
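A possible shorter variant of the same split / replace / join idea uses `dict.get` with the word itself as the fallback value. This is a sketch in Python 3 syntax, not the code that was posted; it reuses the sample data from the message above.

```python
trans = {'a': 'A', 'at': 'AT', 'to': 'TO'}
sample = 'a boy at the park walked to the tree'
expected = 'A boy AT the park walked TO the tree'

# dict.get(key, default) returns the default when the key is missing,
# so a word without an entry falls back to itself and keeps its position.
result = ' '.join(trans.get(word, word) for word in sample.split())

print(result)
assert result == expected
```

Because each word is handled where it stands, there is no separate "not found" list to merge back in afterwards.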
Re: Problem with list.insert
Subhabrata, it's very difficult for me to understand what your short
program is supposed to do, or what you are saying. I think that
formatting and code style are important.

So I suggest you give meaningful names to all your variables, remove
unused variables (like n), add blank lines here and there to separate
logically distinct parts of your program, or even better split it into
functions. You can remove some intermediate variables by coalescing a
few logically related operations into one line, you can put spaces
around operators like = and after commas, you can show a usage example
in English so people can understand what the program is supposed to do,
you can avoid joining and then splitting strings again, and you can
remove useless () around certain things.

This is a possible rewrite of the first part of your code; it's not
exactly equivalent...

    def input_words():
        input_message = "Print one English sentence for dictionary check: "
        return raw_input(input_message).lower().split()

    def load_dictionary():
        return set(line.rstrip() for line in open("words.txt"))

    def dictionary_search(dictionary, words):
        found = []
        not_found = []
        for word in words:
            if word in dictionary:
                found.append(word)
            else:
                not_found.append(word)
        return found + not_found

    inwords = input_words()
    dictionary = load_dictionary()
    print dictionary_search(dictionary, inwords)

It's far from perfect, but you can use it as a starting point for a
rewrite of your whole program.

Bye,
bearophile
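The rewrite above loads the word list into a set, which can say whether a word is known but not what its translation is. A dict-based variant might look like the sketch below (Python 3 syntax). Loud assumption: the layout of the real dictionary file is never shown in the thread, so the one-entry-per-line `word<TAB>translation` format, the placeholder translations and the helper names here are all hypothetical.

```python
import io


def load_dictionary(lines):
    """Build {english_word: translation} from an iterable of text lines.

    ASSUMPTION: one entry per line, 'word<TAB>translation'. The real
    file format is not shown in the thread.
    """
    table = {}
    for line in lines:
        line = line.rstrip('\n')
        if '\t' not in line:
            continue                       # skip blank or malformed lines
        word, translation = line.split('\t', 1)
        table[word.lower()] = translation
    return table


def translate_words(table, words):
    """Look up each word in input order; unknown words pass through."""
    return [table.get(word, word) for word in words]


# io.StringIO stands in for an open file; the translations are placeholders.
sample_file = io.StringIO(u"he\tHE-tr\nhas\tHAS-tr\ncome\tCOME-tr\n")
table = load_dictionary(sample_file)       # loaded once, reused below

for sentence in ("he has come", "Lincoln has come"):
    words = sentence.lower().split()
    print(' '.join(translate_words(table, words)))
```

Each lookup is a single dict access instead of a search through the whole file contents, and the untranslated word stays in its original position.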
Re: Problem with list.insert
Diez B. Roggisch wrote:
> SUBHABRATA wrote:
>> Some people in the room told I am kidding, but I learnt Python from
>> Python docs which gives examples like these,
>> But I write explicit comments,
>> an excerpt from python docs:
>>
>>     >>> # Measure some strings:
>>     ... a = ['cat', 'window', 'defenestrate']
>>     >>> for x in a:
>>     ...     print x, len(x)
>>     ...
>>     cat 3
>>     window 6
>>     defenestrate 12
>>
>> But well, if you are suggesting improvement I'll surely listen.
>
> Please! Just because a tiny 3-line example involving just *one* list
> doesn't give that list a long & speaking name does not mean

Discard my last post - I accidentally pressed submit too early.

Numbering variable names is surely *not* found in any Python example.
Short names do occur, yes, since the examples are clear and don't
require more meaningful ones. But nowhere will you find two-figure
enumerations. Every book or tutorial about programming will teach you
to use meaningful variable names in your program.

As far as your explanation goes: a bunch of question marks with an
occasional "lincoln" spread in between is not really helping; there is
*nothing* to be understood from it. This is most probably not your
fault, as somehow the Hindi gets twisted into question marks. However,
I suggest you provide an example where the Hindi is replaced with
English words (translations or placeholders); otherwise you won't be
understood and can't be helped.

Diez
Re: Problem with list.insert
SUBHABRATA wrote:
> Some people in the room told I am kidding, but I learnt Python from
> Python docs which gives examples like these,
> But I write explicit comments,
> an excerpt from python docs:
>
>     >>> # Measure some strings:
>     ... a = ['cat', 'window', 'defenestrate']
>     >>> for x in a:
>     ...     print x, len(x)
>     ...
>     cat 3
>     window 6
>     defenestrate 12
>
> But well, if you are suggesting improvement I'll surely listen.

Please! Just because a tiny 3-line example involving just *one* list
doesn't give that list a long & speaking name does not mean

> The outputs are given in Hindi, it is a dictionary look up program,
> the matching words are in Hindi, you may leave aside them.
> How to debug the result string is to see the words which are in
> English as the group page does not take italics so I am putting one
> asterisk* after it
>
> NO PROBLEM:
> INPUT: he has come
> OUTPUT IS उओह/ उन्होने रहेसाक्ता २.यात्राकरना
>
> PROBLEM:
> INPUT: (i) Lincoln* has come
> OUTPUT IS:
> रहेसाक्ता २.यात्राकरना lincoln* lincoln lincoln* रहेसाक्ता २.यात्राकरना lincoln lincoln lincoln* lincoln* रहेसाक्ता २.यात्राकरना lincoln
> ….and increasing the number and seems a never ending process.
>
> MY EXPECTED STRING IS:
> lincoln रहेसाक्ता २.यात्राकरना lincoln^
> The latter places marked^ I am editing don't worry for that, though
> MY FINAL EXPECTED STRING IS:
> lincoln रहेसाक्ता २.यात्राकरना
>
> Best Regards,
> Subhabrata.
Re: Problem with list.insert
Some people in the room told I am kidding, but I learnt Python from
Python docs which gives examples like these,
But I write explicit comments,
an excerpt from python docs:

    >>> # Measure some strings:
    ... a = ['cat', 'window', 'defenestrate']
    >>> for x in a:
    ...     print x, len(x)
    ...
    cat 3
    window 6
    defenestrate 12

But well, if you are suggesting improvement I'll surely listen.

The outputs are given in Hindi, it is a dictionary look up program,
the matching words are in Hindi, you may leave aside them.
How to debug the result string is to see the words which are in
English as the group page does not take italics so I am putting one
asterisk* after it

NO PROBLEM:
INPUT: he has come
OUTPUT IS उओह/ उन्होने रहेसाक्ता २.यात्राकरना

PROBLEM:
INPUT: (i) Lincoln* has come
OUTPUT IS:
रहेसाक्ता २.यात्राकरना lincoln* lincoln lincoln* रहेसाक्ता २.यात्राकरना lincoln lincoln lincoln* lincoln* रहेसाक्ता २.यात्राकरना lincoln
….and increasing the number and seems a never ending process.

MY EXPECTED STRING IS:
lincoln रहेसाक्ता २.यात्राकरना lincoln^
The latter places marked^ I am editing don't worry for that, though
MY FINAL EXPECTED STRING IS:
lincoln रहेसाक्ता २.यात्राकरना

Best Regards,
Subhabrata.
Re: Problem with list.insert
On Thu, 28 Aug 2008 09:13:00 -0700, SUBHABRATA wrote:
> import re
> def wordchecker1(n):
>     # INPUTTING STRING
>     a1=raw_input("PRINT ONE ENGLISH SENTENCE FOR DICTIONARY CHECK:")
>     #CONVERTING TO LOWER CASE
>     a2=a1.lower()
>     #CONVERTING INTO LIST
>     a3=a2.split()
>     #DICTIONARY
>     a4=open("/python25/Changedict3.txt","r")
>     a5=a4.read()
>     a6=a5.split()
>     found=[]
>     not_found=[]
>     #SEARCHING DICTIONARY
>     for x in a3:
>         a7="\n"
>         a8=a7+x
>         if a8 in a5:
>             a9=a5.index(a8)
>             a10=a5[a9:]
>             a11=re.search("\xe0.*?\n",a10)
>             a12=a11.group()
>             a13=a12[:-1]
>             found.append(a13)
>         elif a8 not in a5:
>             a14=x
>             not_found.append(a14)
>         else:
>             print "Error"
>     found.extend(not_found)
>     # THE OUTPUT
>     print "OUTPUT STRING IS"
>     a15=(' '.join(found))
>     #THE OUTPUT STRING
>     print a15
>     # SPLITTING OUTPUT STRING IN WORDS
>     a16=a15.split()
>     #TAKING OUT THE WORD FROM OUTPUT STRING
>     for word in a16:
>         #MATCHING WITH GIVEN STRING
>         a17=a2.find(word)
>         if a17>-1:
>             print "The word is found in the Source String"
>             a18=a3.index(word)
>             a19=a3[a18]
>             print a19
>             #INSERTING IN THE LIST OF TARGET STRING
>             a20=a16.insert(a18,a19)
>             print a16
>             a21=(" ".join(a16))
>             print a21

a1, a2, a2, …, a20? You must be kidding. Please stop numbering names
and use *meaningful* names instead!

Could you describe the problem better, with sample inputs and expected
outputs? There must be a better way than that unreadable mess above.

Ciao,
Marc 'BlackJack' Rintsch
Problem with list.insert
Dear Group,
I wrote one program,
There is a dictionary.
There is an input string.
Every word of input string the word is matched against the dictionary
If the word of input string is matched against the dictionary it gives
the word of the dictionary.
But if it does not find it gives the original word.
After searching the words are joined back.
But as I am joining I am finding the words which are not available in
dictionary are printed in the last even if the word is given in the
first/middle.
Now, I want to take them in order.
I am applying a thumb rule that the position of the word of the string
is exact with the resultant string.
So, I am determining the word which is not in the dictionary, and its
position in the input string.
Now I am inserting it in the target string, for this I am splitting
both the given string and the output/result string.
Till now it is working fine.
But a problem happening is that if I insert it it is inserting same
words multiple times and the program seems to be an unending process.
What is the error happening?
If any one can suggest.
The code is given below:

    import re
    def wordchecker1(n):
        # INPUTTING STRING
        a1=raw_input("PRINT ONE ENGLISH SENTENCE FOR DICTIONARY CHECK:")
        #CONVERTING TO LOWER CASE
        a2=a1.lower()
        #CONVERTING INTO LIST
        a3=a2.split()
        #DICTIONARY
        a4=open("/python25/Changedict3.txt","r")
        a5=a4.read()
        a6=a5.split()
        found=[]
        not_found=[]
        #SEARCHING DICTIONARY
        for x in a3:
            a7="\n"
            a8=a7+x
            if a8 in a5:
                a9=a5.index(a8)
                a10=a5[a9:]
                a11=re.search("\xe0.*?\n",a10)
                a12=a11.group()
                a13=a12[:-1]
                found.append(a13)
            elif a8 not in a5:
                a14=x
                not_found.append(a14)
            else:
                print "Error"
        found.extend(not_found)
        # THE OUTPUT
        print "OUTPUT STRING IS"
        a15=(' '.join(found))
        #THE OUTPUT STRING
        print a15
        # SPLITTING OUTPUT STRING IN WORDS
        a16=a15.split()
        #TAKING OUT THE WORD FROM OUTPUT STRING
        for word in a16:
            #MATCHING WITH GIVEN STRING
            a17=a2.find(word)
            if a17>-1:
                print "The word is found in the Source String"
                a18=a3.index(word)
                a19=a3[a18]
                print a19
                #INSERTING IN THE LIST OF TARGET STRING
                a20=a16.insert(a18,a19)
                print a16
                a21=(" ".join(a16))
                print a21

Best Regards,
Subhabrata.
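A side note on the statement `a20=a16.insert(a18,a19)` in the code above: `list.insert` changes the list in place and returns `None`, so the name on the left never holds the updated list. A minimal sketch in Python 3 syntax, with made-up variable names:

```python
words = ['has', 'come']

# insert(index, item) edits `words` itself; the call has no useful result.
returned = words.insert(0, 'lincoln')

print(returned)   # the value handed back by insert
print(words)      # the list after the in-place change
```

So the assignment can simply be dropped and the call written on its own line.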