Re: [Tutor] FW: wierd replace problem

2010-09-15 Thread Roelof Wobben





Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Steven D'Aprano
On Wed, 15 Sep 2010 07:20:33 am Walter Prins wrote:
> Correction on my last paragraph on my last mail:
> "See also when Python is asked to "print" the string, you can see the
> escape characters really there." -> "See also when Python is asked to
> "print" the string, you can see the escape characters aren't part of
> the actual contents of the string."

I think that half of the confusion here is that people are confused 
about what escape characters are. When you read text from a file, and 
Python sees a backslash, it DOESN'T add a second backslash to escape 
it. Nor does it add quotation marks at the start or end.

Text is text. The characters you read from a text file -- or the bytes 
you read from any file -- remain untouched, exactly as they existed in 
the file. But what changes is the DISPLAY of the text.

When you have the four characters abcd (with no quotation marks) and you 
ask Python to display it on the command line, the display includes 
punctuation, in this case quotation marks, just like a list includes 
punctuation [,] and a dict {:,}. If the string includes certain special 
characters like newlines, tabs, backslashes and quotation marks, the 
display uses escape codes \n \t \\ \' or \" as punctuation to the 
display only. But the extra backslashes don't exist in the string, any 
more than lists include items [ and ].
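To make the distinction concrete, here is a small sketch (my own illustration, not something posted in this thread; it calls print() as a function so that it also runs on Python 3). It compares the number of characters actually in a string with the length of the display form that the interactive prompt would show:

```python
# A string holding the four letters abcd plus one newline, one tab,
# one backslash and one single quote: eight characters in total.
text = "abcd\n\t\\'"

# repr() builds the display form the interactive prompt would show.
shown = repr(text)

print(len(text))   # number of characters really in the string
print(shown)       # the display form, with quotes and escape codes
print(len(shown))  # larger, because of the added punctuation

# The display form is valid source code that evaluates back to the string.
assert eval(shown) == text
```

The extra characters counted in the display form are exactly the punctuation described above: the two delimiting quotes and the backslashes of the escape codes.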

When you enter string literals, the way to enter them is by typing the 
display form. The display form includes matching quotation marks at the 
beginning and end, and escaping special characters. But that 
punctuation isn't part of the string.



-- 
Steven D'Aprano
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Walter Prins
Correction on my last paragraph on my last mail:
"See also when Python is asked to "print" the string, you can see the escape
characters really there." -> "See also when Python is asked to "print" the
string, you can see the escape characters aren't part of the actual contents
of the string."


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Walter Prins
On 14 September 2010 21:10, Roelof Wobben  wrote:

> I understand it but I try to understand why in a file there is this 'word
> python makes a "'word.
>

Python doesn't change what it reads from the file.  However, depending on
how you ask Python to tell you what it has read (or what the contents of a
string variable are), it might display the value quoted and/or with escape
characters, or not, as the case may be.  If what you've read from the file
contains quotes, then you need to be careful not to mistake Python's quoting
of the value of a string for *part* of that string.  Nor must you mistake an
escape character (if applicable) for part of the string.

For example, consider the following exchange in the Python shell (please try
all of this yourself and experiment):

>>> s = 'blah'
>>> s
'blah'
>>> print s
blah
>>>

I assign the value 'blah' to the string s.  So far simple enough.
Obviously the quotes used in the assignment do not form part of the string
itself.  Their only purpose is to show Python where the string starts and
where it ends.

In the next line I ask Python to evaluate the expression s, which it duly
reports as 'blah'.  Again, it's using the normal Python convention to format
the data as a string, because that's what s is, a string object.  But the
quotes are formatting, they're not really part of the string.

In the next line I ask Python to *print* s.  Now the true content of s is
printed as it is, and hence you can see that the quotes are not part of the
string.

Now consider the following exchange in the Python shell where I open a file
and write some text to it to prove this point:
>>> f = open('test.txt', 'w+')
>>> f.write('blah')
>>> f.close()
>>> import os
>>> os.system('notepad test.txt')

The last line above opens the text file test.txt in Notepad so you can see
the contents.  As you can see, no quotes or anything else.  Now, while open,
suppose we put a single quote in the file, so it reads:
'blah
...and suppose we then save it and exit notepad so you're back in the Python
shell.  Then we do:

>>> f=open('test.txt','r+')
>>> s=f.read()
>>> f.close()
>>> s
"'blah"

Now I've read the contents of the file back into a string variable s, and
asked Python to evaluate (output) this string object.

Notice, Python is now formatting the string with *double* quotes (previously
it defaulted to single quotes) to avoid having to escape the single quote
that forms part of the string.  If Python had used single quotes instead,
then there would've been an ambiguity with the single quote that's part of
the string and so it would've had to escape that too.  So consequently it
formats the string with double quotes, which is valid Python syntax and
avoids the backslash. (Stating the obvious, strings can be quoted with
double or single quotes.)  As before, the double quotes, as with the single
quotes earlier, are not part of the string.  They are merely formatting
since Python is being asked to display a string and hence it must indicate
the start and end of the string with suitable quote characters.
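The way the display form picks its quote characters can be checked with repr(), which produces the same text the prompt shows. This is only a sketch with made-up values (the variable names are mine), written so that it runs on Python 3 as well:

```python
plain = "blah"          # no quote characters inside
with_single = "'blah"   # contains one single quote
with_both = "'blah\""   # contains a single and a double quote

print(repr(plain))        # 'blah'
print(repr(with_single))  # "'blah"
print(repr(with_both))    # '\'blah"'

# Whatever delimiters repr() picks, the string itself is unchanged.
for value in (plain, with_single, with_both):
    assert eval(repr(value)) == value
```

Only in the last case does a backslash appear, because no choice of delimiter avoids the clash.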

Now, as before do:

>>> print s
'blah

As before, with print you see the contents of the string as it is (and as
indeed it is also in the file that you saved). Just the single quote you
added at the front of blah. No double or single quotes or anything else.

Now finally, let's try something a bit more elaborate.  Do again:

>>> os.system('notepad test.txt')

Then put into the file the following 2 lines of text (notice the file now
contains 2 lines, and both single and double quotes...):
+++"+++This line is double quoted in the file and the quotes have + symbols
around them.+++"+++
---'---This line is single quoted in the file and the quotes have - symbols
around them.---'---

Save it, exit Notepad, then do:
>>> f=open('test.txt', 'r+')
>>> s=f.read()
>>> f.close()
>>> s
'+++"+++This line is double quoted in the file and the quotes have + symbols
around them.+++"+++\n---\'---This line is single quoted in the file and the
quotes have - symbols around them.---\'---\n'
>>> print s
+++"+++This line is double quoted in the file and the quotes have + symbols
around them.+++"+++
---'---This line is single quoted in the file and the quotes have - symbols
around them.---'---

Notice we read both lines in the file into one single string.  See how
Python formats that as a string object, and escapes not only the single
quotes but also the line break characters (\n).  See also when Python is
asked to "print" the string, you can see the escape characters really there.
See what's happened?  Do you understand why?
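For anyone who wants to repeat the experiment without Notepad, here is a rough equivalent as a script. It is a sketch under my own assumptions: it uses a temporary file instead of test.txt, shorter sample lines, and print() as a function so that it runs on Python 3.

```python
import os
import tempfile

# Two lines of text: one contains double quotes, the other single quotes.
content = '+"+ first line +"+\n' + "-'- second line -'-\n"

# Write the text to a temporary file (a stand-in for test.txt).
handle, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(handle, "w") as f:
    f.write(content)

# Read the whole file back into one string, then clean up.
with open(path) as f:
    s = f.read()
os.remove(path)

print(repr(s))  # display form: one line, with \n and \' escapes
print(s)        # actual contents: two lines, no backslashes

assert s == content
assert "\\" not in s       # the data holds no backslash character
assert s.count("\n") == 2  # each \n in the display is one real newline
```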

Walter


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Roelof Wobben




> Date: Tue, 14 Sep 2010 21:05:06 +0100
> Subject: Re: [Tutor] FW: wierd replace problem
> From: wpr...@gmail.com
> To: rwob...@hotmail.com
> CC: tutor@python.org
>
> Roelof,
>
> On 14 September 2010 17:35, Roelof Wobben wrote:
>> But how can I use the triple quotes when reading a text file?
>
> To repeat what I said before, obviously not clearly enough: All the
> quoting stuff, escaping stuff, all of that ONLY APPLIES TO STRINGS/DATA
> INSIDE OF YOUR PYTHON CODE. It does NOT APPLY TO DATA INSIDE OF
> FILES! Why not to files? Because there's no ambiguity in data inside
> a file. It's understood that everything in a file is just data. By
> contrast, in Python code, quote characters have *meaning*.
> Specifically they indicate the start and end of string literals. So
> when they themselves are part of the string you have to write them
> specially to indicate their meaning, either as closing the string, or
> as part of the string data.
>
> In a file by contrast, every character is presumed to be just a piece
> of data, and so quotes have no special inherent meaning to Python, so
> they just represent themselves and always just form part of the data
> being read from the file. Do you understand what I'm saying? If you
> have any doubt please respond so we can try to get this cleared up --
> Unless and until you realise there's a difference between data in a
> file and string literals in your python source code you're not going to
> understand what you're doing here.
>
> Regards,
>
> Walter

 
I understand that, but I am trying to understand why, when the file contains
'word, Python shows it as "'word.
I know that a file is just data, but I thought that once the data is read into
a string I could use the quoting stuff on it.
But I'm very wrong here.
 
Roelof
 
  


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Walter Prins
Roelof,

On 14 September 2010 17:35, Roelof Wobben  wrote:

> But how can I use the triple quotes when reading a text file?
>

To repeat what I said before, obviously not clearly enough:  All the quoting
stuff, escaping stuff, all of that ONLY APPLIES TO STRINGS/DATA INSIDE OF
YOUR PYTHON CODE.  It does NOT APPLY TO DATA INSIDE OF FILES!  Why not to
files?  Because there's no ambiguity in data inside a file.  It's understood
that everything in a file is just data.  By contrast, in Python code, quote
characters have *meaning*.  Specifically they indicate the start and end of
string literals.  So when they themselves are part of the string you have to
write them specially to indicate their meaning, either as closing the
string, or as part of the string data.

In a file by contrast, every character is presumed to be just a piece of
data, and so quotes have no special inherent meaning to Python, so they just
represent themselves and always just form part of the data being read from
the file.   Do you understand what I'm saying?  If you have any doubt please
respond so we can try to get this cleared up -- Unless and until you realise
there's a difference between data in a file and string literals in your
python source code you're not going to understand what you're doing here.
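A short sketch of that difference. The file is simulated with io.StringIO so the example needs no real file on disk (that substitution, and the sample text, are my assumptions), and it uses print() as a function so it runs on Python 3:

```python
import io

# Stand-in for an open text file whose contents are:  'tis "love"
# The escaping below is needed only because this is a literal in source code.
fake_file = io.StringIO("'tis \"love\"\n")

# Reading needs no quoting or escaping at all.
data = fake_file.read()

# The quote characters are ordinary data, so normal string methods apply.
first_word = data.split()[0]
print(first_word)             # 'tis
print(first_word.strip("'"))  # tis
print(data.count('"'))        # 2
```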

Regards,

Walter


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Francesco Loffredo

On 14/09/2010 18.35, Roelof Wobben wrote:

> ...
> It was not confusing when I read your explanation.
> Still, I'm going crazy: with you and Joel the strip works, and with me I get errors.
>
> But how can I use the triple quotes when reading a text file?

Very easy: YOU DON'T NEED TO USE ANY QUOTES.
All the quoting stuff is ONLY needed for literals, that is for those 
strings you directly write in a program. For example:


# Here you need to avoid ambiguities, because quoting chars are also used
# to start and end the string.
MyString = """ these are my favourite quoting characters: "'` """

You could obtain EXACTLY the same string with backslash-quoting:

# Here I had to quote the single-quote char.
YourString = ' these are my favourite quoting characters: "\'` '

# And here the double quote had to be quoted by backslash.
HerString = " these are my favourite quoting characters: \"'` "


But when you DON'T write the string yourself, you don't need any quoting:
ThisString = MyString + YourString + HerString  # no quoting required
ThatString = file.read() # again, NO QUOTING REQUIRED.
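The claim that the three literals give EXACTLY the same string is easy to verify. A minimal check, reusing the names from the message above (the assertions are my addition, and print() is called as a function so it runs on Python 3):

```python
MyString = """ these are my favourite quoting characters: "'` """
YourString = ' these are my favourite quoting characters: "\'` '
HerString = " these are my favourite quoting characters: \"'` "

# Three different spellings in the source, one and the same value.
assert MyString == YourString == HerString

# None of them contains a backslash: the escapes exist only in the source.
assert "\\" not in YourString
assert "\\" not in HerString

print(MyString)
```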


Roelof

Francesco


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Roelof Wobben




> From: rwob...@hotmail.com
> To: sander.swe...@gmail.com
> Subject: RE: [Tutor] FW: wierd replace problem
> Date: Tue, 14 Sep 2010 17:40:28 +
>
>
>
>
> 
>> From: sander.swe...@gmail.com
>> To: tutor@python.org
>> Date: Tue, 14 Sep 2010 19:28:28 +0200
>> Subject: Re: [Tutor] FW: wierd replace problem
>>
>>
>> - Original message -
>>> Look at the backslash! It doesn't strip the backslash in the string, but
>>> it escapes the double quote following it.
>>>
>>> I don't know how people can explain it any better.
>>
>> Maybe the link below makes it clear what backslash really does.
>>
>> http://pythonconquerstheuniverse.wordpress.com/2008/06/04/gotcha-%e2%80%94-backslashes-are-escape-characters/
>>
>> Greets
>> sander
>

Okay,

I get it.
When I want to delete a " I have to use a backslash.

For me case closed. Everyone thanks for the patience and explanations.

Roelof


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Sander Sweers

- Original message -
> Look at the backslash! It doesn't strip the backslash in the string, but 
> it escapes the double quote following it.
> 
> I don't know how people can explain it any better.

Maybe the link below makes it clear what backslash really does.

http://pythonconquerstheuniverse.wordpress.com/2008/06/04/gotcha-%e2%80%94-backslashes-are-escape-characters/

Greets
sander 


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Joel Goldstick
On Tue, Sep 14, 2010 at 12:35 PM, Roelof Wobben  wrote:

> But how can I use the triple quotes when reading a text file?
> Roelof
>

The triple quotes let you put quote characters inside the string.  For
instance, if you want to check for single quotes, double quotes, and ), you
can do this:
 """'")"""

It's hard to make out, so here it is with extra spacing (you can't actually
write it this way, because the spaces would become part of the string; this
is just to illustrate):  """ ' " ) """
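Here is how that triple-quoted literal behaves when handed to strip(). The sample word is invented for the illustration, and print() is used as a function so the sketch runs on Python 3:

```python
# The characters to strip: single quote, double quote, closing parenthesis.
unwanted = """'")"""

# The same three characters written with a backslash escape instead.
assert unwanted == '\'")'
assert len(unwanted) == 3

# A made-up word wrapped in the unwanted characters.
word = "\"'hello')\""
print(word)                  # "'hello')"
print(word.strip(unwanted))  # hello
```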


-- 
Joel Goldstick


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Timo

On 14-09-10 17:44, Roelof Wobben wrote:

>> 9 stripped_words = words.strip(".,!?'`\"- ();:")
>
> Hello Joel,
>
> Your solution works.
> I'm going crazy. I tried for two days with strip and got an EOF error message,
> and now there are no messages.

   
Look at the backslash! It doesn't strip the backslash in the string, but 
it escapes the double quote following it.


I don't know how people can explain it any better.

You *do* see that this doesn't work, right?
>>> s = "foo"bar"

So you can't do this either:
>>> word.strip(",.!";:")

You need to escape the quote between the quotes:
>>> word.strip(".,!\";:")
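As a side note, the backslash is one of two ways to write that argument. A small sketch (the sample word is made up, and print() is called as a function so it runs on Python 3) showing that the escaped literal and a single-quoted literal are the same string:

```python
word = '"Hello!";'

escaped = ".,!\";:"       # double-quoted, so the inner " must be escaped
single_quoted = '.,!";:'  # single-quoted, so no escape is needed

# Both spellings produce the same six characters, none of them a backslash.
assert escaped == single_quoted
assert "\\" not in escaped

print(word.strip(escaped))  # Hello
```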

Timo


Roelof





Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Roelof Wobben




Date: Tue, 14 Sep 2010 17:45:35 +0200
From: f...@libero.it
To: tutor@python.org
Subject: Re: [Tutor] FW: wierd replace problem


On 14/09/2010 16.29, Roelof Wobben wrote:
>...
> Oke,
>
> I see the problem.
>
> When I have this sentence : `'Tis so,' said the Duchess: `and the moral of 
> that is--"Oh,
> 'tis love, 'tis love, that makes the world go round!"'
>
> And I do string.strip() the output will be :
>
> `'This and that one does not fit your explanation.
> So I have to strip everything before I strip it.

After some trial and error with the interpreter, I found this:

# Notice the starting and ending triple quotes, just to avoid all the
# backslashes and the ambiguities with quoting characters ;-)
st = """`'Tis so,' said the Duchess: `and the moral of that is--"Oh,
'tis love, 'tis love, that makes the world go round!"'"""

wordlist = [thisone.strip("""'",!` :-""") for thisone in
            st.replace('"', " ").replace("-", " ").split()]

I don't know if you read the chapter regarding "list comprehensions" in
your tutorial, but this is one of those. First of all, I replace all
double quotes " and dashes - with spaces:
st.replace('"', " ").replace("-"," ")
then I use split() to divide the long string in a list of almost-words.
At the end, with the for clause in the list comprehension, I strip all
leading and trailing non-letters (again, I enclose them in a
triple-quoted string) from each of the elements of the list.

In the end, I have wordlist:

['Tis', 'so', 'said', 'the', 'Duchess', 'and', 'the', 'moral', 'of',
'that', 'is', 'Oh', 'tis', 'love', 'tis', 'love', 'that', 'makes',
'the', 'world', 'go', 'round', '']

What about this? Was it confusing?
>
> Roelof
Francesco
 
Hello Francesco,

It was not confusing when I read your explanation.
Still, I'm going crazy: with you and Joel the strip works, and with me I get errors.

But how can I use the triple quotes when reading a text file?
Roelof




Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Francesco Loffredo

On 14/09/2010 16.29, Roelof Wobben wrote:

> ...
> Okay,
>
> I see the problem.
>
> When I have this sentence: `'Tis so,' said the Duchess:  `and the moral of
> that is--"Oh,
> 'tis love, 'tis love, that makes the world go round!"'
>
> And I do string.strip() the output will be:
>
> `'This and that one does not fit your explanation.
> So I have to strip everything before I strip it.


After some trial and error with the interpreter, I found this:

# Notice the starting and ending triple quotes, just to avoid all the
# backslashes and the ambiguities with quoting characters ;-)
st = """`'Tis so,' said the Duchess:  `and the moral of that is--"Oh,
'tis love, 'tis love, that makes the world go round!"'"""


wordlist = [thisone.strip("""'",!` :-""") for thisone in
            st.replace('"', " ").replace("-", " ").split()]


I don't know if you read the chapter regarding "list comprehensions" in 
your tutorial, but this is one of those. First of all, I replace all 
double quotes " and dashes - with spaces:

st.replace('"', " ").replace("-"," ")
then I use split() to divide the long string into a list of almost-words.
At the end, with the for clause in the list comprehension, I strip all 
leading and trailing non-letters (again, I enclose them in a 
triple-quoted string) from each of the elements of the list.


In the end, I have wordlist:

['Tis', 'so', 'said', 'the', 'Duchess', 'and', 'the', 'moral', 'of', 
'that', 'is', 'Oh', 'tis', 'love', 'tis', 'love', 'that', 'makes', 
'the', 'world', 'go', 'round', '']
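The empty string at the end of that list is what remains of the lone closing quote after stripping. One possible way to drop such leftovers, sketched here as a variation on the code above (the extra filtering step and the names punctuation and candidates are mine), is a second pass that keeps only non-empty entries:

```python
st = """`'Tis so,' said the Duchess:  `and the moral of that is--"Oh,
'tis love, 'tis love, that makes the world go round!"'"""

punctuation = """'",!` :-"""

# Same steps as above: replace, split, then strip each almost-word.
candidates = st.replace('"', " ").replace("-", " ").split()
wordlist = [w.strip(punctuation) for w in candidates]

# Extra step: drop entries that became empty after stripping.
wordlist = [w for w in wordlist if w]

print(wordlist)
```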


What about this? Was it confusing?


Roelof

Francesco


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Roelof Wobben




> From: rwob...@hotmail.com
> To: joel.goldst...@gmail.com
> Subject: RE: [Tutor] FW: wierd replace problem
> Date: Tue, 14 Sep 2010 15:43:42 +
>
>
>
>
> 
>> Date: Tue, 14 Sep 2010 11:28:10 -0400
>> From: joel.goldst...@gmail.com
>> To: tutor@python.org
>> Subject: Re: [Tutor] FW: wierd replace problem
>>
>>
>>
>> On Tue, Sep 14, 2010 at 10:29 AM, Roelof Wobben wrote:
>>
>> I offer my solution. I didn't bother to make every word lower case,
>> and I think that would improve the result
>>
>> Please offer critique, improvements
>>
>>
>> Some explanation:
>>
>> line 5 -- I read the complete text into full_text, while first
>> replacing -- with a space
>> line 7 -- I split the full text string into words
>> lines 8 - 15 -- Word by word I strip all sorts of characters that
>> aren't in words from the front and back of each 'word'
>> lines 11 - 14 -- this is EAFP -- try to add one to the bin with that
>> word, if no such bin, make it and give it 1
>> lines 16, 17 -- since dicts don't sort, sort on the keys then loop thru
>> the keys to print out the key (word) and the count
>>
>> 
>>  1 #! /usr/bin/env python
>>  2
>>  3 word_count = {}
>>  4 file = open ('alice_in_wonderland.txt', 'r')
>>  5 full_text = file.read().replace('--',' ')
>>  6
>>  7 full_text_words = full_text.split()
>>  8 for words in full_text_words:
>>  9     stripped_words = words.strip(".,!?'`\"- ();:")
>> 10     ##print stripped_words
>> 11     try:
>> 12         word_count[stripped_words] += 1
>> 13     except KeyError:
>> 14         word_count[stripped_words] = 1
>> 15
>> 16 ordered_keys = word_count.keys()
>> 17 ordered_keys.sort()
>> 18 ##print ordered_keys
>> 19 print "All the words and their frequency in 'alice in wonderland'"
>> 20 for k in ordered_keys:
>> 21     print k, word_count[k]
>> 22
>> --
>> Joel Goldstick
>>
>>
>
>

Hello Joel,

Your solution works.
I'm going crazy. I tried for two days with strip and got an EOF error message,
and now there are no messages.

Roelof


  


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Joel Goldstick
On Tue, Sep 14, 2010 at 10:29 AM, Roelof Wobben  wrote:

I offer my solution.  I didn't bother to make every word lower case, and I
think that would improve the result

Please offer critique, improvements


Some explanation:

line 5 -- I read the complete text into full_text, while first replacing --
with a space
line 7 -- I split the full text string into words
lines 8 - 15 -- Word by word I strip all sorts of characters that aren't in
words from the front and back of each 'word'
lines 11 - 14 -- this is EAFP -- try to add one to the bin with that word,
if no such bin, make it and give it 1
lines 16, 17 -- since dicts don't sort, sort on the keys then loop thru the
keys to print out the key (word) and the count


 1 #! /usr/bin/env python
 2
 3 word_count = {}
 4 file = open ('alice_in_wonderland.txt', 'r')
 5 full_text = file.read().replace('--',' ')
 6
 7 full_text_words = full_text.split()
 8 for words in full_text_words:
 9     stripped_words = words.strip(".,!?'`\"- ();:")
10     ##print stripped_words
11     try:
12         word_count[stripped_words] += 1
13     except KeyError:
14         word_count[stripped_words] = 1
15
16 ordered_keys = word_count.keys()
17 ordered_keys.sort()
18 ##print ordered_keys
19 print "All the words and their frequency in 'alice in wonderland'"
20 for k in ordered_keys:
21     print k, word_count[k]
22
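For comparison, here is one possible variation that adds the lower-casing mentioned above. It is only a sketch and differs from the script in a few ways that are my own choices: it targets Python 3, it uses collections.Counter in place of the try/except KeyError idiom, it skips tokens that are empty after stripping, and it runs on a short made-up sample instead of alice_in_wonderland.txt.

```python
from collections import Counter

# Same characters as in the strip() call of the script above.
STRIP_CHARS = ".,!?'`\"- ();:"


def count_words(text):
    """Return a Counter mapping each lower-cased word to its frequency."""
    counts = Counter()
    for raw in text.replace("--", " ").split():
        word = raw.strip(STRIP_CHARS).lower()
        if word:  # skip tokens that consisted of punctuation only
            counts[word] += 1
    return counts


# Made-up sample text standing in for the contents of the book.
sample = "Alice was beginning--very tired. 'Alice!' said ALICE."
counts = count_words(sample)

for word in sorted(counts):
    print(word, counts[word])
```

With the real book, the sample string would be replaced by the text read from the file.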
-- 
Joel Goldstick


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Roelof Wobben




> From: st...@pearwood.info
> To: tutor@python.org
> Date: Tue, 14 Sep 2010 21:30:01 +1000
> Subject: Re: [Tutor] FW: wierd replace problem
>
> On Tue, 14 Sep 2010 05:38:18 pm Roelof Wobben wrote:
>
>>>> Strip ('"'') does not work.
>>>> Still this message : SyntaxError: EOL while scanning string
>>>> literal
> [...]
>> I understand what you mean but we're talking about a text-file which
>> will be read in a string. So I can't escape the quotes. As far as I
>> know I can't control how Python is reading a text-file with quotes.
>
> The text file has nothing to do with this. The text file is fine. The
> error is in the strings that YOU type, not the text file.
>
> Strings must have MATCHING quotation marks:
>
> This is okay: "abcd"
> So is this: 'abcd'
>
> But this is not: "abcd'
>
> You need to *look carefully* at strings you type and make sure that the
> start and end quotes match. Likewise you can't insert the SAME
> quotation mark in a string unless you escape it first:
>
> This is okay: "'Hello,' he said."
> So is this: '"Goodbye," she replied.'
>
> But this is not: 'He said "I can't see you."'
>
> But this is okay: 'He said "I can\'t see you."'
>
>
>
>
> --
> Steven D'Aprano

Okay,
 
I see the problem.
 
When I have this sentence : `'Tis so,' said the Duchess:  `and the moral of 
that is--"Oh,
'tis love, 'tis love, that makes the world go round!"'

And I do string.strip() the output will be :
 
`'This and that one does not fit your explanation.
So I have to strip everything before I strip it. 
 
Roelof
  


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Steven D'Aprano
On Tue, 14 Sep 2010 05:38:18 pm Roelof Wobben wrote:

> >> Strip ('"'') does not work.
> >> Still this message : SyntaxError: EOL while scanning string
> >> literal
[...]
> I understand what you mean but we're talking about a text-file which
> will be read in a string. So I can't escape the quotes. As far as I
> know I can't control  how Python is reading a text-file with quotes.

The text file has nothing to do with this. The text file is fine. The 
error is in the strings that YOU type, not the text file.

Strings must have MATCHING quotation marks:

This is okay: "abcd"
So is this: 'abcd'

But this is not: "abcd'

You need to *look carefully* at strings you type and make sure that the 
start and end quotes match. Likewise you can't insert the SAME 
quotation mark in a string unless you escape it first:

This is okay: "'Hello,' he said."
So is this: '"Goodbye," she replied.'

But this is not: 'He said "I can't see you."'

But this is okay: 'He said "I can\'t see you."'
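These rules can be tested without triggering the error in your own script, by handing each candidate literal to the built-in compile() as a piece of source text. The helper name is_valid_literal is made up for this sketch, which is written for Python 3:

```python
def is_valid_literal(source):
    """Return True if source compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# Each entry is the source text of one of the examples above.
examples = [
    '"abcd"',                               # matching double quotes
    "'abcd'",                               # matching single quotes
    "\"abcd'",                              # mismatched quotes
    """'He said "I can't see you."'""",     # unescaped ' inside '...'
    r"""'He said "I can\'t see you."'""",   # the same with \' escaped
]

for source in examples:
    print(source, "->", is_valid_literal(source))
```

Only the third and fourth entries should be reported as invalid, matching the "this is not" cases listed above.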




-- 
Steven D'Aprano


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Roelof Wobben

Okay,

Can this also be the same problem?

In the text there is this:

'tis is represented as  "'this

And this

part is represented as part.
 
 
Roelof



> Date: Tue, 14 Sep 2010 11:41:28 +0100
> From: wpr...@gmail.com
> To: tutor@python.org
> Subject: Re: [Tutor] FW: wierd replace problem
>
>
>
> On 14 September 2010 11:09, James Mills wrote:
> $ python
> Python 2.6.5 (r265:79063, Jun 13 2010, 14:03:16)
> [GCC 4.4.4 (CRUX)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> s = "foo\"bar'"
> >>> s
> 'foo"bar\''
>
> I'd like to point something out here. Typing "s" as James showed here
> at the prompt outputs a version of the string that Python will
> understand if fed again, consequently it's encoded to do the same
> escaping of characters as you would have to do if you put that
> expression into your python source yourself. If however you entered:
>
> print s
>
> ... then Python would've printed the value of s as it really is, without
> any escape characters or anything else, e.g.:
>
> >>> print s
> foo"bar'
>
> So, even though you see a \' in the output posted by James above, the \
> that was output by Python isn't actually part of the string (e.g. it's
> not really there as such), it was only output that way by Python to
> disambiguate the value of the string.
>
> So, at the risk of belaboring this point to death, if you do:
>
> s = '\'\'\''
>
> then the contents of the string s in memory is '''
>
> The string does not contain the slashes. The slashes are only there to
> help you make Python understand that the quotes must not be interpreted
> as the end of the string, but rather as part of the string. You could
> get the exact same result by doing:
>
> s = "'''"
>
> Here there's no ambiguity and consequently no need for slashes since
> the string is delineated by double quotes and not single quotes.
>
> Hope that helps.
>
> Walter
>
> ___ Tutor maillist -
> Tutor@python.org To unsubscribe or change subscription options:
> http://mail.python.org/mailman/listinfo/tutor 
>   
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Francesco Loffredo

On 13/09/2010 20.21, Roelof Wobben wrote:

...

The problem as stated in the book is :


3.Write a program called alice_words.py that creates a text file named 
alice_words.txt containing an alphabetical listing of all the words found in 
alice_in_wonderland.txt together with the number of times each word occurs. The 
first 10 lines of your output file should look something like this:
Word Count
===
a 631
a-piece 1
abide 1
able 1
about 94
above 3
absence 1
absurd 2
How many times does the word, alice, occur in the book?

The text can be found here : 
http://openbookproject.net/thinkcs/python/english2e/resources/ch10/alice_in_wonderland.txt

So I open the file.
Read the first rule.

This is no problem for me.

Then I want to remove some characters like ' , " when the word in the text 
begins with these characters.
And there is the problem. The ' and " can't be removed with replace.

Not true. Your problem with replace() and strip() lies in the way you 
gave them the characters to remove, not in the file nor in the functions 
themselves. For example, you said:

Strip ('"'') does not work.
Still this message : SyntaxError: EOL while scanning string literal

Let's take a look into your argument:
(  parens to start the argument
'  first quote begins a string literal;
"  first character to remove, you want to delete all double quotes. OK.
'  ... what is this? Is it the end of the string, or is it another 
character to remove?
' This solves the doubt: '' means that the first string is over, and 
another has started. OK. (maybe not what you expected...)
)  OUCH! The closing paren is swallowed by the second string, which is 
still open. So, End Of Line has been reached while that string was 
still open. Error.


The correct syntax (and you know where to look to read it!) should be:
MyText.strip('"\'')
The backslash tells Python that you want to include the following single 
quote in your string without terminating it, then the next single quote 
gives a proper end to the string itself.
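
One way to see the difference without triggering the error yourself is to hand both spellings to the built-in compile() as text. This is only a sketch: is_valid_expression is a hypothetical helper name, and the raw triple-quoted strings are used just to hold the source text exactly as it would be typed.

```python
def is_valid_expression(source):
    """Return True if `source` parses as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# Raw triple-quoted strings keep the text exactly as typed.
broken = r"""MyText.strip('"'')"""
fixed = r"""MyText.strip('"\'')"""

print(is_valid_expression(broken))  # False
print(is_valid_expression(fixed))   # True

# Evaluate the corrected call on a sample value (an assumption, not from the book).
sample = '"\'tis"'
result = eval(fixed, {"MyText": sample})
print(result)  # tis
```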


Earlier, you said:

I have tried that but then I see this message :

 File "C:\Users\wobben\workspace\oefeningen\src\test.py", line 8
letter2 = letter.strip('`")
  ^
SyntaxError: EOL while scanning string literal

Change it to (''`"") do not help either.

Sure: as some pointed out before, those arguments are NOT proper string 
literals, as the first one doesn't start and end with the SAME quoting 
character, while the second is made from two empty strings with a 
floating backquote ` between them.

If you want to strip (or replace) the three quoting characters
' " and ` , you should enclose them in a properly formatted string 
literal, that is, one that starts and ends with the same quoting 
character and that, if it must contain that same character, has it 
escaped with a backslash:

.strip("'\"`")
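
Applied to a few words, that call behaves as follows. The word list is made up for illustration, and the snippet uses print() as a function so it runs on Python 3 as well:

```python
# Hypothetical sample words; only leading and trailing quote characters go away.
words = ['"dark', "'tis", "`twas'", "a-piece", "can't"]
cleaned = [w.strip("'\"`") for w in words]

print(cleaned)  # ['dark', 'tis', 'twas', 'a-piece', "can't"]
```

Note that the apostrophe inside can't survives, because strip() only looks at the two ends of the string.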


Am i now clear what the problem is Im facing.

Roelof


I hope so. After having cut away all unwanted characters, you still have 
to face the main request from your tutorial, and THE question (sorry, 
Hamlet!): How many times does the word "alice" occur in the book?


I would guess... none!
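
Presumably the point of that guess is that the book spells the name with a capital A, so an exact, case-sensitive count of "alice" finds nothing. A small sketch on a made-up sentence (not taken from the text file) shows the effect of lowercasing before counting:

```python
from collections import Counter

# Hypothetical sample sentence, not taken from the book's text file.
sample = "Alice was tired. So Alice sat down."

words = sample.replace(".", "").split()
exact = Counter(words)                       # case-sensitive counts
folded = Counter(w.lower() for w in words)   # counts after lowercasing

print(exact["alice"])   # 0
print(exact["Alice"])   # 2
print(folded["alice"])  # 2
```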

Keep up your "fishing"
Francesco


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Walter Prins
On 14 September 2010 11:09, James Mills wrote:

> $ python
> Python 2.6.5 (r265:79063, Jun 13 2010, 14:03:16)
> [GCC 4.4.4 (CRUX)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> s = "foo\"bar'"
> >>> s
> 'foo"bar\''
>

I'd like to point something out here.  Typing "s" as James showed here at
the prompt outputs the repr() of the string, a version that Python will
understand if fed back to it; consequently it contains the same escaping of
characters that you would have to write if you put that expression into your
Python source yourself.  If however you entered:

print s

... then Python would've printed the value of s as it really is, without any
escape characters or anything else, e.g.:

>>> print s
foo"bar'

So, even though you see a \' in the output posted by James above, the \ that
was output by Python isn't actually part of the string (i.e. it's not really
there as such); it was only output that way by Python to disambiguate the
value of the string.
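
The same comparison can be made in a script rather than at the prompt. A minimal sketch, written with print() as a function so it also runs on Python 3 (the session above is Python 2):

```python
s = "foo\"bar'"

print(repr(s))  # 'foo"bar\''  (what the interactive prompt shows)
print(s)        # foo"bar'     (the actual characters)
print(len(s))   # 8: no backslash is counted
```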

So, at the risk of belaboring this point to death, if you do:

s = '\'\'\''

then the contents of the string s in memory is '''

The string does not contain the backslashes.  The backslashes are only there
to help you make Python understand that the quotes must not be interpreted as
the end of the string, but rather as part of the string.  You could get the
exact same result by doing:

s = "'''"

Here there's no ambiguity and consequently no need for backslashes since the
string is delimited by double quotes and not single quotes.
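
That the two spellings produce the same value can be confirmed with a short check (the names s1 and s2 are only for illustration):

```python
s1 = '\'\'\''   # three escaped single quotes
s2 = "'''"      # the same three characters, no escaping needed

print(s1 == s2)     # True
print(len(s1))      # 3
print("\\" in s1)   # False: the string holds no backslash
```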

Hope that helps.

Walter


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread James Mills
On Tue, Sep 14, 2010 at 5:28 PM, Roelof Wobben  wrote:
> Strip ('"'') does not work.
> Still this message : SyntaxError: EOL while scanning string literal
>
> So I think I go for the suggestion of Bob and develop a program which deletes 
> all the ' and " by scanning it character by character.

I seriously don't understand why you're having so much trouble with this
and why this thread has gotten as long as it has :/

Look here:

$ python
Python 2.6.5 (r265:79063, Jun 13 2010, 14:03:16)
[GCC 4.4.4 (CRUX)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = "foo\"bar'"
>>> s
'foo"bar\''
>>> s.replace("\"", "").replace("'", "")
'foobar'
>>>

Surely you can use the same approach ?
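
For comparison, here is the chained replace() next to a single-pass alternative. The alternative uses str.maketrans, which is the Python 3 spelling and therefore an assumption relative to the Python 2.6 session above:

```python
s = "foo\"bar'"

# The approach shown above: one replace() call per unwanted character.
chained = s.replace('"', "").replace("'", "")

# Python 3 alternative: a translation table that deletes both characters.
table = str.maketrans("", "", "\"'")
translated = s.translate(table)

print(chained)     # foobar
print(translated)  # foobar
```

Both remove the quotes wherever they occur, including inside words, so "can't" would become "cant".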

cheers
James

-- 
-- James Mills
--
-- "Problems are solved by method"


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Walter Prins
On 14 September 2010 08:38, Roelof Wobben  wrote:

>
> I understand what you mean but we're talking about a text-file which will
> be read in a string.
> So I can't escape the quotes. As far as I know I can't control  how Python
> is reading a text-file with quotes.
>
>
Putting a value into a string via Python source code is not the same as
putting a value into a string by reading a file.  When you read a file it's
implicit that whatever's in the file (including quotes) will end up in the
string.  There's no ambiguity so no need to worry about escaping.
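
To illustrate, the sketch below writes a line containing both kinds of quotes to a temporary file (the file name and the sentence are made up) and reads it back. Escaping is needed only where the text is typed as a literal in the source; the data read from the file needs none:

```python
import os
import tempfile

# Hypothetical file standing in for alice_in_wonderland.txt.
path = os.path.join(tempfile.mkdtemp(), "quotes.txt")
with open(path, "w") as f:
    f.write("\"'tis dark,\" she said\n")   # escaping needed here: this is source code

with open(path) as f:
    line = f.readline()                    # no escaping involved: this is data

print(line)
words = [w.strip("\"',") for w in line.split()]
print(words)  # ['tis', 'dark', 'she', 'said']
```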

In Python source code however, quotes have meaning implied by context, so
you have to encode/write strings differently *in source* in order to
disambiguate your intention.  In other words, when Python encounters a first
quote in source code, it interprets this as the start of a string.  When it
encounters the next quote of the same type, it interprets this as the end of
the string, and so on.  If you therefore want to put a quote of the same
type you're using to delimit a string inside of that same string, you need
to "escape" the quote, i.e. tell Python not to interpret the quote as it
normally would, and instead to take it literally as part of the current
string value being read.  You do this by putting a backslash (i.e. \ )
before it.  Of course, the same problem arises when you want to put a
backslash itself in a string, so the backslash also needs to be
escaped in order to be put inside a string in Python.
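
A short sketch of the backslash case; the path is borrowed from the traceback quoted elsewhere in this thread and serves only as sample text:

```python
one = "\\"                   # two characters in the source, one in the string
print(len(one))              # 1

path = "C:\\Users\\wobben"   # each \\ in the source is a single backslash
print(path)                  # C:\Users\wobben
print(path.count("\\"))      # 2
```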

HTH

Walter


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Roelof Wobben




> Date: Tue, 14 Sep 2010 09:32:38 +0200
> From: timomli...@gmail.com
> To: tutor@python.org
> Subject: Re: [Tutor] FW: wierd replace problem
>
> On 14-09-10 09:28, Roelof Wobben wrote:
>>
>>
>> Hello,
>>
>> Strip ('"'') does not work.
>> Still this message : SyntaxError: EOL while scanning string literal
>>
> Review it again, see how many quotes you are using.
>
> For example, this won't work either:
>>>> s = 'foo'bar'
>
> You need to escape the quotes with a backslash, like:
>>>> s = 'foo\'bar'
>>>> print s
> foo'bar
>
>
> Cheers,
> Timo

 
Hello Timo,
 
I understand what you mean but we're talking about a text-file which will be 
read in a string.
So I can't escape the quotes. As far as I know I can't control  how Python is 
reading a text-file with quotes.
 
Roelof
 
 
 


Re: [Tutor] FW: wierd replace problem

2010-09-14 Thread Timo

On 14-09-10 09:28, Roelof Wobben wrote:



> Hello,
>
> Strip ('"'') does not work.
> Still this message : SyntaxError: EOL while scanning string literal
   

Review it again, see how many quotes you are using.

For example, this won't work either:
>>> s = 'foo'bar'

You need to escape the quotes with a backslash, like:
>>> s = 'foo\'bar'
>>> print s
foo'bar


Cheers,
Timo




[Tutor] FW: wierd replace problem

2010-09-14 Thread Roelof Wobben



Hello,

Strip ('"'') does not work.
Still this message : SyntaxError: EOL while scanning string literal

So I think I will go for Bob's suggestion and develop a program which deletes 
all the ' and " by scanning the text character by character.

 Roelof


> 
>> From: st...@pearwood.info
>> To: tutor@python.org
>> Date: Tue, 14 Sep 2010 09:39:29 +1000
>> Subject: Re: [Tutor] wierd replace problem
>>
>> On Tue, 14 Sep 2010 09:08:24 am Joel Goldstick wrote:
>>> On Mon, Sep 13, 2010 at 6:41 PM, Steven D'Aprano wrote:
>>>> On Tue, 14 Sep 2010 04:18:36 am Joel Goldstick wrote:
>>>>> How about using str.split() to put words in a list, then run
>>>>> strip() over each word with the required characters to be removed
>>>>> ('`")
>>>>
>>>> Doesn't work. strip() only removes characters at the beginning and
>>>> end of the word, not in the middle:
>>>
>>> Exactly, you first split the words into a list of words, then strip
>>> each word
>>
>> Of course, if you don't want to remove ALL punctuation marks, but only
>> those at the beginning and end of words, then strip() is a reasonable
>> approach. But if the aim is to strip out all punctuation, no matter
>> where, then it can't work.
>>
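
The difference between the two methods described in the quoted paragraph is easy to see on a single word (the sample word is made up):

```python
word = "'can't'"

print(word.strip("'"))        # can't  (only the outer apostrophes go)
print(word.replace("'", ""))  # cant   (every apostrophe goes)
```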
>> Since the aim is to count words, a better approach might be a hybrid --
>> remove all punctuation marks like commas, fullstops, etc. no matter
>> where they appear, keep internal apostrophes so that words like "can't"
>> are different from "cant", but remove external ones. Although that
>> loses information in the case of (e.g.) dialect speech:
>>
>> "'e said 'e were going to kill the lady, Mister Holmes!"
>> cried the lad excitedly.
>>
>> You probably want to count the word as 'e rather than just e.
>>
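
The hybrid approach quoted above could be sketched roughly as follows. This is not a definitive implementation: hybrid_words and the REMOVE_EVERYWHERE character set are assumptions made for illustration, and the sample sentence is invented.

```python
# Punctuation deleted wherever it appears (an assumed, incomplete set).
REMOVE_EVERYWHERE = '",.;:!?()[]'

def hybrid_words(line):
    """Split a line into lowercase words, keeping internal ' and -."""
    line = line.replace("--", " ")        # double dashes separate words
    for ch in REMOVE_EVERYWHERE:
        line = line.replace(ch, "")
    words = []
    for token in line.lower().split():
        token = token.strip("'`")         # external quotes only
        if token and token != "-":        # a lone hyphen is dropped
            words.append(token)
    return words

sample = '"I can\'t--said Alice - \'tis a-piece," cried the lad.'
print(hybrid_words(sample))
# ['i', "can't", 'said', 'alice', 'tis', 'a-piece', 'cried', 'the', 'lad']
```

As the quoted text warns, this loses information for dialect forms: 'e comes out as plain e.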
>> And hyphenation is tricky too. A lone hyphen - like these - should be
>> deleted. But double-dashes--like these--are word separators, so need to
>> be replaced by a space. Otherwise, single hyphens should be kept. If a
>> word begins or ends with a hyphen, it should be joined up with the
>> previous or next word. But then it gets more complicated, because you
>> don't know whether to keep the hyphen after joining or not.
>>
>> E.g. if the line ends with:
>>
>> blah blah blah blah some-
>> thing blah blah blah.
>>
>> should the joined up word become the compound word "some-thing" or the
>> regular word "something"? In general, there's no way to be sure,
>> although you can make a good guess by looking it up in a dictionary and
>> assuming that regular words should be preferred to compound words. But
>> that will fail if the word has changed over time, such as "cooperate",
>> which until very recently used to be written "co-operate", and before
>> that as "coöperate".
>>
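
The dictionary-lookup guess described in the quoted text might look like the sketch below. KNOWN_WORDS is a tiny hypothetical stand-in for a real word list, and join_line_break is an illustrative name, so treat this as one possible reading rather than a complete solution:

```python
# Hypothetical stand-in for a real dictionary of regular words.
KNOWN_WORDS = {"something", "cooperate"}

def join_line_break(first_half, second_half, known_words=KNOWN_WORDS):
    """Join a word split across lines; first_half ends with a hyphen."""
    joined = first_half.rstrip("-") + second_half
    if joined.lower() in known_words:
        return joined                    # prefer the regular word
    return first_half + second_half      # otherwise keep the hyphen

print(join_line_break("some-", "thing"))  # something
print(join_line_break("a-", "piece"))     # a-piece
```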
>>
>>
>> --
>> Steven D'Aprano


Re: [Tutor] FW: wierd replace problem

2010-09-13 Thread Steven D'Aprano
On Tue, 14 Sep 2010 08:05:33 am bob gailer wrote:

> > 3.Write a program called alice_words.py that creates a text file
> > named alice_words.txt containing an alphabetical listing of all the
> > words found in alice_in_wonderland.txt together with the number of
> > times each word occurs. The first 10 lines of your output file
> > should look something like this: Word Count
> > ===
> > a 631
> > a-piece 1
> > abide 1
> > able 1
> > about 94
> > above 3
> > absence 1
> > absurd 2
> > How many times does the word, alice, occur in the book?
>
> We still do not have a definition of "word". Only some examples.

Nor do we have a definition of "text", "file", "alphabetical", "first", 
"10", "lines", "definition", or "overly pedantic".

A reasonable person would use the common meaning of all of these words, 
unless told otherwise. In this case, the only ambiguity is 
whether hyphenated words like "a-piece" should count as two words or 
one, but fortunately the example above clearly shows that it should 
count as a single word.

This is an exercise, not an RFC. Be reasonable.


-- 
Steven D'Aprano


Re: [Tutor] FW: wierd replace problem

2010-09-13 Thread bob gailer

 On 9/13/2010 2:21 PM, Roelof Wobben wrote:






From: rwob...@hotmail.com
To: bgai...@gmail.com
Subject: RE: [Tutor] wierd replace problem
Date: Mon, 13 Sep 2010 18:19:43 +

I suggest you give a clear, complete and correct problem statement.
Right now we are shooting in the dark at a moving target.

Something like.

Given the file alice_in_wonderland.txt, copied from url so-and-so

Remove these characters ...

Split into words (not letters?) where word is defined as

Count the frequency of each word.

Hello,

The problem as stated in the book is :


3.Write a program called alice_words.py that creates a text file named 
alice_words.txt containing an alphabetical listing of all the words found in 
alice_in_wonderland.txt together with the number of times each word occurs. The 
first 10 lines of your output file should look something like this:
Word Count
===
a 631
a-piece 1
abide 1
able 1
about 94
above 3
absence 1
absurd 2
How many times does the word, alice, occur in the book?


We still do not have a definition of "word". Only some examples.


The text can be found here : 
http://openbookproject.net/thinkcs/python/english2e/resources/ch10/alice_in_wonderland.txt

So I open the file.
Read the first rule.

This is no problem for me.

Then I want to remove some characters like ' , " when the word in the text 
begins with these characters.
And there is the problem. The ' and " can't be removed with replace.


Not true. replace() will replace any character. You wrote in your other post


 letter2 = letter.strip('`")



 SyntaxError: EOL while scanning string literal



 Change it to (''`"") do not help either.


Do you understand the error? strip expects a string.
'`" and ''`"" are NOT strings. Please review Python syntax for string literals.
Here again we bump into a fundamental problem - your not understanding some of 
the basics of Python.


So in the output you will see something like this "dark instead of dark

word is the words of the sentence which is read in from the text-file.

Am i now clear what the problem is Im facing.
Somewhat clearer. We need a definition of "word". Examples help but are 
not definitions.


Example - word is a string of characters including a-z and -. The first 
and last characters must be in a-z. Your definition may be different.
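
That example definition translates fairly directly into a regular expression. A sketch, assuming the text is lowercased first; the names WORD and find_words are illustrative:

```python
import re

# A letter, optionally followed by letters/hyphens that end in a letter.
WORD = re.compile(r"[a-z](?:[a-z-]*[a-z])?")

def find_words(text):
    """Return all words in `text` under the a-z and hyphen definition."""
    return WORD.findall(text.lower())

print(find_words('"--dark and a-piece; \'tis'))
# ['dark', 'and', 'a-piece', 'tis']
```

Under this definition a double dash between letters, as in some--thing, still counts as part of one word, so a different definition may be needed for that case.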


BTW see http://dictionary.reference.com/browse/%27tis where 'tis IS a word.

Your original program (what DID become of the backslash???) is WAY off 
the mark. You must process one character at a time, decide whether it is 
the beginning of a word, the end of a word, within a word, or outside 
any word.


Take the beginning of the alice file, and BY HAND decide which category 
the first character is in. Then the 2nd, etc. That gives you the 
algorithm. Then translate that to Python.
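
One possible translation of that character-by-character procedure is sketched below. The rules it applies (letters belong to a word, a single hyphen may sit inside a word, anything else ends the word) are assumptions chosen for illustration, and scan_words is a hypothetical name:

```python
def scan_words(text):
    """Collect words by examining one character at a time."""
    words = []
    current = []                       # characters of the word being built
    for ch in text.lower():
        if "a" <= ch <= "z":
            current.append(ch)         # beginning of, or within, a word
        elif ch == "-" and current and current[-1] != "-":
            current.append(ch)         # a single hyphen may be inside a word
        elif current:
            # Any other character ends the word; drop a trailing hyphen.
            words.append("".join(current).rstrip("-"))
            current = []
    if current:                        # word still open at end of text
        words.append("".join(current).rstrip("-"))
    return words

print(scan_words('"--dark a-piece, \'tis some--thing"'))
# ['dark', 'a-piece', 'tis', 'some', 'thing']
```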


Keep fishing. One day the struggle will be over.

HTH

--
Bob Gailer
919-636-4239
Chapel Hill NC



Re: [Tutor] FW: wierd replace problem

2010-09-13 Thread Roelof Wobben




> Date: Mon, 13 Sep 2010 15:31:08 -0400
> Subject: Re: [Tutor] FW: wierd replace problem
> From: joel.goldst...@gmail.com
> To: rwob...@hotmail.com
>
>
>
> On Mon, Sep 13, 2010 at 2:24 PM, Roelof Wobben
>> wrote:
>
>
>
> 
>> From: rwob...@hotmail.com
>> To: joel.goldst...@gmail.com
>> Subject: RE: [Tutor] wierd replace problem
>> Date: Mon, 13 Sep 2010 18:23:36 +
>>
>>
>>
>>
>> 
>>> Date: Mon, 13 Sep 2010 14:18:36 -0400
>>> From: joel.goldst...@gmail.com
>>> To: tutor@python.org
>>> Subject: Re: [Tutor] wierd replace problem
>>>
>>>
>>>
>>> On Mon, Sep 13, 2010 at 2:08 PM, bob gailer
>>>> wrote:
>>> On 9/13/2010 1:50 PM, Roelof Wobben wrote:
>>>
>>> [snip]
>>>
>>> hello Alan,
>>>
>>> Your right. Then it prints like this "'tis
>>> Which is not right. It must be tis.
>>> So the replace does not what it supposed to do.
>>>
>>> Sorry but I am now more confused. After discovering no \ in the text
>>> file now you seem to have have a new specification, which is to get rid
>>> of the '.
>>>
>>> I suggest you give a clear, complete and correct problem statement.
>>> Right now we are shooting in the dark at a moving target.
>>>
>>> Something like.
>>>
>>> Given the file alice_in_wonderland.txt, copied from url so-and-so
>>>
>>> Remove these characters ...
>>>
>>> Split into words (not letters?) where word is defined as
>>>
>>> Count the frequency of each word.
>>>
>>>
>>> --
>>> Bob Gailer
>>> 919-636-4239
>>> Chapel Hill NC
>>>
>>>
>>> How about using str.split() to put words in a list, then run strip()
>>> over each word with the required characters to be removed ('`")
>>>
>>> --
>>> Joel Goldstick
>>>
>>>
>>
>
> Hello Joel.
>
> That can be a solution but when i have --dark the -- must be removed.
> But in a-piece the - must not be removed.
>
> Roelof
>
> strip only removes from start and end of string. Not from the middle,
> so a-piece would stay as a word
>
> --
> Joel Goldstick
>
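
The behaviour Joel describes can be checked on the two words from the question; the third word is an extra made-up case:

```python
words = ["--dark", "a-piece", "dark--"]
stripped = [w.strip("-") for w in words]

print(stripped)  # ['dark', 'a-piece', 'dark']
```

The hyphen inside a-piece is untouched because strip() only removes characters from the two ends of the string.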
 
Oke, 
 
I have tried that but then I see this message :
 
 File "C:\Users\wobben\workspace\oefeningen\src\test.py", line 8
letter2 = letter.strip('`")
  ^
SyntaxError: EOL while scanning string literal
 
Changing it to (''`"") does not help either.
 
Roelof

  


[Tutor] FW: wierd replace problem

2010-09-13 Thread Roelof Wobben




> From: rwob...@hotmail.com
> To: joel.goldst...@gmail.com
> Subject: RE: [Tutor] wierd replace problem
> Date: Mon, 13 Sep 2010 18:23:36 +
>
>
>
>
> 
>> Date: Mon, 13 Sep 2010 14:18:36 -0400
>> From: joel.goldst...@gmail.com
>> To: tutor@python.org
>> Subject: Re: [Tutor] wierd replace problem
>>
>>
>>
>> On Mon, Sep 13, 2010 at 2:08 PM, bob gailer
>>> wrote:
>> On 9/13/2010 1:50 PM, Roelof Wobben wrote:
>>
>> [snip]
>>
>> hello Alan,
>>
>> Your right. Then it prints like this "'tis
>> Which is not right. It must be tis.
>> So the replace does not what it supposed to do.
>>
>> Sorry but I am now more confused. After discovering no \ in the text
>> file now you seem to have have a new specification, which is to get rid
>> of the '.
>>
>> I suggest you give a clear, complete and correct problem statement.
>> Right now we are shooting in the dark at a moving target.
>>
>> Something like.
>>
>> Given the file alice_in_wonderland.txt, copied from url so-and-so
>>
>> Remove these characters ...
>>
>> Split into words (not letters?) where word is defined as
>>
>> Count the frequency of each word.
>>
>>
>> --
>> Bob Gailer
>> 919-636-4239
>> Chapel Hill NC
>>
>>
>> How about using str.split() to put words in a list, then run strip()
>> over each word with the required characters to be removed ('`")
>>
>> --
>> Joel Goldstick
>>
>>
>

Hello Joel.

That can be a solution but when i have --dark the -- must be removed.
But in a-piece the - must not be removed.

Roelof


[Tutor] FW: wierd replace problem

2010-09-13 Thread Roelof Wobben





> From: rwob...@hotmail.com
> To: bgai...@gmail.com
> Subject: RE: [Tutor] wierd replace problem
> Date: Mon, 13 Sep 2010 18:19:43 +
>
>
>
>
> 
>> Date: Mon, 13 Sep 2010 14:08:46 -0400
>> From: bgai...@gmail.com
>> To: tutor@python.org
>> Subject: Re: [Tutor] wierd replace problem
>>
>> On 9/13/2010 1:50 PM, Roelof Wobben wrote:
>>
>> [snip]
>>> hello Alan,
>>>
>>> Your right. Then it prints like this "'tis
>>> Which is not right. It must be tis.
>>> So the replace does not what it supposed to do.
>>>
>> Sorry but I am now more confused. After discovering no \ in the text
>> file now you seem to have have a new specification, which is to get rid
>> of the '.
>>
>> I suggest you give a clear, complete and correct problem statement.
>> Right now we are shooting in the dark at a moving target.
>>
>> Something like.
>>
>> Given the file alice_in_wonderland.txt, copied from url so-and-so
>>
>> Remove these characters ...
>>
>> Split into words (not letters?) where word is defined as
>>
>> Count the frequency of each word.
>>
>> --
>> Bob Gailer
>> 919-636-4239
>> Chapel Hill NC
>>
>
> Hello,
>
> The problem as stated in the book is :
>

3.Write a program called alice_words.py that creates a text file named 
alice_words.txt containing an alphabetical listing of all the words found in 
alice_in_wonderland.txt together with the number of times each word occurs. The 
first 10 lines of your output file should look something like this:
Word Count
===
a 631
a-piece 1
abide 1
able 1
about 94
above 3
absence 1
absurd 2
How many times does the word, alice, occur in the book?

The text can be found here : 
http://openbookproject.net/thinkcs/python/english2e/resources/ch10/alice_in_wonderland.txt

So I open the file.
Read the first line.

This is no problem for me.

Then I want to remove some characters like ' , " when the word in the text 
begins with these characters.
And there is the problem. The ' and " can't be removed with replace.
So in the output you will see something like this: "dark instead of dark

word is the words of the sentence which is read in from the text-file.

Is it now clear what problem I'm facing?

Roelof