Re: [Tutor] regular expressions query

2019-05-24 Thread Mats Wichmann
On 5/23/19 6:15 PM, mhysnm1...@gmail.com wrote:
> All,
> 
>  
> 
> Below I am just providing the example of what I want to achieve, not the
> original strings that I will be using the regular expression against.
> 
>  
> 
> The original strings could have:
> 
>  
> 
> "Hello world"
> 
> "hello  World everyone"
> 
> "hello everyone"
> 
> "hello   world and friends"
> 
>  
> 
> I have a string which is "hello world" which I want to identify by using
> regular expression how many times:
> 
> * "hello" occurs on its own.
> * "Hello world" occurs in the list of strings regardless of the number
> of white spaces.

I don't know if you've moved on from this problem, but here's one way
one might tackle finding the hello world's in this relatively simple
scenario:

1. join all the strings into a single string, on the assumption that you
care about substrings that span a line break.
2. use the findall method to hit all instances
3. specify the ignore case flag to the re method
4. specify one-or-more bits of whitespace between words of the substring
in your regular expression pattern.

Most of that is assumption since, as Alan said, you didn't describe the
problem precisely enough for a programmer, even if it sounds precise
enough in English (e.g. hello occurs on its own: does that mean all
instances of hello, or all instances of hello not followed by world? etc.)

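import re
# your own list of strings goes here; these are the samples from your mail
strings = ["Hello world", "hello  World everyone", "hello everyone", "hello   world and friends"]
hits = re.findall(r'hello\s+world', ' '.join(strings), flags=re.IGNORECASE)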

Running this on your sample data shows there are three hits (you can do
len(hits) for that)
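
The two possible readings of "hello occurs on its own" can be made concrete with a small sketch. This is only an illustration on the sample strings; the negative lookahead pattern and the variable names are assumptions made here, not something taken from the original question.

```python
import re

# Sample strings from the original post.
strings = [
    "Hello world",
    "hello  World everyone",
    "hello everyone",
    "hello   world and friends",
]
text = " ".join(strings)

# Reading 1: every "hello", whatever follows it.
all_hellos = re.findall(r"\bhello\b", text, flags=re.IGNORECASE)

# Reading 2 (assumed interpretation): "hello" that is NOT followed by
# whitespace and then "world", expressed with a negative lookahead.
lone_hellos = re.findall(r"\bhello\b(?!\s+world\b)", text, flags=re.IGNORECASE)

# "hello world" with any amount of whitespace between the two words.
pairs = re.findall(r"\bhello\s+world\b", text, flags=re.IGNORECASE)

print(len(all_hellos), len(lone_hellos), len(pairs))  # prints: 4 1 3
```

Which of the first two counts is wanted depends on the answer to the question above about what "on its own" means.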

===

That's the kind of thing regular expressions are good for, but always
keep in mind that they're not always that simple to wrestle with, which
has led to the infamous quote (credited to Jamie Zawinski, although he
repurposed it from an earlier quote on something different):

Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems.
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


[Tutor] Fwd: RE: regular expressions query

2019-05-24 Thread Alan Gauld via Tutor


Forwarding to the list, please use reply-all or reply-list when
responding to list mails.

Alan G.
 Forwarded Message 
Subject:RE: [Tutor] regular expressions query
Date:   Fri, 24 May 2019 20:10:48 +1000
From:   mhysnm1...@gmail.com
To: 'Alan Gauld' 



Alan,

I have gone back to the drawing board as I found too many problems with the
approach I was using, because the original data has multiple spaces between
words. I want to find unique phrases in the strings such as "Hello World"
regardless of the number of spaces that might be in the string.

I have used the following lines of code, which separate the complete strings
that are repeated from those that occur only once.

transaction = [item for item, count in
collections.Counter(narration).items() if count > 1]
non_dup_narration = [item for item, count in
collections.Counter(narration).items() if count < 2]

So I end up with two lists: one containing the complete strings with more
than one occurrence and another containing those with only one. As there are
common words in the non_dup_narration list of strings, I am trying to find a
method of extracting this information. I am still reading about collections
as this could help. But I wanted to understand whether you can inject
variables into the pattern of a regular expression, which was the intent of
the original question. Each time the regular expression is executed, a
different word would be in the pattern.
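
On the question of injecting variables into a pattern: a regular expression pattern is an ordinary string, so it can be built from variables before it is compiled. The sketch below is one possible way to do that. The helper name phrase_pattern and the sample narration list are hypothetical, chosen only for illustration; re.escape is used so that any special characters in the words are matched literally.

```python
import re

def phrase_pattern(phrase):
    """Hypothetical helper: build a case-insensitive pattern for a phrase,
    allowing any run of whitespace between its words."""
    # re.escape makes characters such as "." or "+" in a word match literally.
    words = [re.escape(word) for word in phrase.split()]
    # The \b anchors assume each phrase starts and ends with a word character.
    return re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)

# Stand-in data; the real narration list is not shown in the thread.
narration = [
    "Hello world",
    "hello  World everyone",
    "hello everyone",
    "hello   world and friends",
]

counts = {}
for phrase in ("hello world", "hello everyone"):
    pattern = phrase_pattern(phrase)  # a different phrase on each pass
    counts[phrase] = sum(len(pattern.findall(line)) for line in narration)

print(counts)  # prints: {'hello world': 3, 'hello everyone': 1}
```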

In Python 3.7, I want to understand Unions and Maps. I have read information
on this in different places and still don't understand why, how and when you
would use them. This is something else I have been wondering about.

Goal here is to grow my knowledge in programming.

# end if
# end for
print (count)
# end for
input ()
# end for

-Original Message-
From: Tutor  On Behalf Of
Alan Gauld via Tutor
Sent: Friday, 24 May 2019 7:41 PM
To: tutor@python.org
Subject: Re: [Tutor] regular expressions query

On 24/05/2019 01:15, mhysnm1...@gmail.com wrote:

> Below I am just providing the example of what I want to achieve, not
> the original strings that I will be using the regular expression against.

While I'm sure you understand what you want I'm not sure I do.
Can you be more precise?

> The original strings could have:
> "Hello world"
> "hello World everyone"
> "hello everyone"
> "hello world and friends"

> I have a string which is "hello world" which I want to identify by
> using regular expression how many times:
>
> * "hello" occurs on its own.

Define "on its own". Is the answer for the strings above 4?
Or is it 1 (i.e. once without an accompanying world)?

> * "Hello world" occurs in the list of strings regardless of the number
> of white spaces.

I assume you mean the answer above should be 3?

Now for each scenario how do we treat

"helloworldeveryone"?
"hello to the world"
"world, hello"
"hello, world"

> Splitting the string into an array ['hello', 'world'] and then
> re-joining it together and using a loop to move through the strings
> does not provide the information I want. So I was wondering if this is
> possible via regular expressions matching?

It is probably possible by splitting the strings and searching, or even just
using multiple standard string searches. But regex is possible too. A lot
depends on the complexity of your real problem statement, rather than the
hello world example you've given. I suspect the real case will be trickier
and therefore more likely to need a regex.

> Modifying the original string is one option. But I was wondering if
> this could be done?

I'm not sure what you have in mind. For searching purposes you shouldn't
need to modify the original string. (Of course Python strings are immutable
so technically you can never modify a string, but in practice you can
achieve the same
effect.)

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos




Re: [Tutor] I'm having a small problem with my code

2019-05-24 Thread Alan Gauld via Tutor
Forwarding to the list.
Always use Reply-All or Reply-List when responding to the list.
Otherwise it only goes to the member who posted.

Alan G.


On 24/05/2019 10:20, David Lifschitz wrote:
> Hi.
> I'm learning the processes of python so I'm trying to figure out how
> to sort it manually.
>
> On Fri, May 24, 2019 at 3:00 AM Alan Gauld via Tutor wrote:
>
> On 23/05/2019 13:16, David Lifschitz wrote:
>
> > The next job of the code is to sort the list of numbers that were
> > inputted in an ascending fashion.
>
> You are aware that Python lists have a built in sort method?
> It will be more efficient than anything you can create
> yourself in Python. Assuming you know that and decided
> to write a sort routine for fun
>
>
> > There is no error from the code, however, when I run the code the
> > first inputted number stays in its place and doesn't get sorted with
> > the rest of the numbers.
> > Any advice???
>
> Yes, see below:
>
> > emptyList = []
>
> This is a terrible name since it becomes misleading the
> instant you put anything into it!
>
> number_list or similar would be more accurate.
> Or even just 'data'
>
> > nOFN = int(input("how many numbers do you want to sort: "))
> >
> > for x in range(nOFN):
> >     number1 = int(input("input number: "))
> >     emptyList.append(number1)
>
> You could have used a list comprehension:
>
> emptyList = [int(input("input number: ")) for n in range(nOFN)]
>
> Now onto the problem sort code
>
> > firstElement = emptyList[0]
>
> Why did you do this? You never refer to firstElement again...
>
> > n = len(emptyList)
> > for j in range(1, n):
> >     if emptyList[j-1] > emptyList[j]:
> >         (emptyList[j-1], emptyList[j]) = (emptyList[j], emptyList[j-1])
>
> Consider the first case, j=1
>
> If the first element is greater than the second
> you swap them. Otherwise you leave them in place.
>
> The loop now considers elements 2 and 3.
> If 2 > 3 you reverse them, otherwise move on.
> But if element 3 is less than element 1 you never
> go back to move it to the top.
>
> Consider this example - [3,2,1]
>
> 1st iteration -> 2,3,1
> 2nd iteration -> 2,1,3
>
> Loop ends.
> But you never swapped 1 and 2 after (or during) the last iteration.
>
> Your sort routine is fundamentally flawed. You need a rethink.
> But not too much because the built in sort will nearly always be
> preferred!
>
> Incidentally, creating working sort algorithms is one of the
> hardest things to get right in computing. It is one of
> those things that can seem right then one specific pattern
> will break it.
>
> -- 
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
>
>
> -- 
> Sent from an email account
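
The single-pass problem described in the quoted message can be reproduced, and one common repair shown, with the sketch below. It is an illustration only: the function names single_pass and bubble_sort are made up here, and repeating the pass until nothing is swapped (a bubble sort) is just one of several ways to fix the routine, not necessarily the one the original poster should adopt.

```python
def single_pass(data):
    """One left-to-right pass of adjacent swaps, as in the posted code."""
    data = list(data)  # work on a copy so the caller's list is untouched
    for j in range(1, len(data)):
        if data[j - 1] > data[j]:
            data[j - 1], data[j] = data[j], data[j - 1]
    return data

def bubble_sort(data):
    """Repeat the same pass until a full pass makes no swaps."""
    data = list(data)
    swapped = True
    while swapped:
        swapped = False
        for j in range(1, len(data)):
            if data[j - 1] > data[j]:
                data[j - 1], data[j] = data[j], data[j - 1]
                swapped = True
    return data

print(single_pass([3, 2, 1]))  # prints: [2, 1, 3]
print(bubble_sort([3, 2, 1]))  # prints: [1, 2, 3]
```

For real work the built in list.sort() or sorted() is still the better choice, as noted in the quoted message.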


-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



Re: [Tutor] regular expressions query

2019-05-24 Thread Alan Gauld via Tutor
On 24/05/2019 01:15, mhysnm1...@gmail.com wrote:

> Below I am just providing the example of what I want to achieve, not the
> original strings that I will be using the regular expression against.

While I'm sure you understand what you want I'm not sure I do.
Can you be more precise?

> The original strings could have:
> "Hello world"
> "hello  World everyone"
> "hello everyone"
> "hello   world and friends"

> I have a string which is "hello world" which I want to identify by using
> regular expression how many times:
> 
> * "hello" occurs on its own.

Define "on its own". Is the answer for the strings above 4?
Or is it 1 (i.e. once without an accompanying world)?

> * "Hello world" occurs in the list of strings regardless of the number
> of white spaces.

I assume you mean the answer above should be 3?

Now for each scenario how do we treat

"helloworldeveryone"?
"hello to the world"
"world, hello"
"hello, world"

> Splitting the string into an array ['hello', 'world'] and then re-joining it
> together and using a loop to move through the strings does not provide the
> information I want. So I was wondering if this is possible via regular
> expressions matching?

It is probably possible by splitting the strings and searching,
or even just using multiple standard string searches. But regex is
possible too. A lot depends on the complexity of your real
problem statement, rather than the hello world example
you've given. I suspect the real case will be trickier
and therefore more likely to need a regex.
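
For the simple hello world example, the non-regex route mentioned above might look like the following sketch. The normalise helper is a hypothetical name, and the approach assumes that a plain substring test is good enough, which it may not be for the real data (it would also match inside longer words such as "othello worldwide").

```python
# Sample strings from the original post.
strings = [
    "Hello world",
    "hello  World everyone",
    "hello everyone",
    "hello   world and friends",
]

def normalise(s):
    """Lower-case the string and collapse each run of whitespace to one space."""
    # split() with no arguments splits on any run of whitespace.
    return " ".join(s.lower().split())

target = normalise("hello world")

# Plain substring test on the normalised copies; the originals are unchanged.
matches = [s for s in strings if target in normalise(s)]

print(len(matches))  # prints: 3
```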

> Modifying the original string is one option. But I was wondering if this
> could be done?

I'm not sure what you have in mind. For searching purposes
you shouldn't need to modify the original string. (Of course
Python strings are immutable so technically you can never
modify a string, but in practice you can achieve the same
effect.)
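
A short sketch of what that means in practice, using a made-up variable name and one of the sample strings: every string method returns a new string, and "modifying" a string really means rebinding the name to that new object.

```python
original = "hello   world and friends"

# Building a new string leaves the original untouched.
collapsed = " ".join(original.split())
print(repr(original))   # prints: 'hello   world and friends'
print(repr(collapsed))  # prints: 'hello world and friends'

# In-place modification is not allowed for str objects.
try:
    original[0] = "H"
except TypeError as exc:
    print("cannot modify in place:", exc)

# The usual idiom: rebind the name to the new string.
text = original
text = text.replace("hello", "Hello")
print(repr(text))       # prints: 'Hello   world and friends'
```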

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos




[Tutor] regular expressions query

2019-05-24 Thread mhysnm1964
All,

 

Below I am just providing the example of what I want to achieve, not the
original strings that I will be using the regular expression against.

 

The original strings could have:

 

"Hello world"

"hello  World everyone"

"hello everyone"

"hello   world and friends"

 

I have a string which is "hello world" which I want to identify by using
regular expression how many times:

*   "hello" occurs on its own.
*   "Hello world" occurs in the list of strings regardless of the number
of white spaces.

 

Splitting the string into an array ['hello', 'world'] and then re-joining it
together and using a loop to move through the strings does not provide the
information I want. So I was wondering if this is possible via regular
expressions matching?

 

Modifying the original string is one option. But I was wondering if this
could be done?
