[Tutor] Using Regular Expression to extracting string in brackets on a list

2013-12-29 Thread Jing Ai
Hello,

I am trying to rewrite some contents on a long list that contains words
within brackets and outside brackets and I'm having trouble extracting the
words within brackets, especially since I have to add the append function
for list as well.  Does anyone have any suggestions? Thank you!

*An example of list*:

['hypothetical protein BRAFLDRAFT_208408 [Branchiostoma floridae]\n',
'hypoxia-inducible factor 1-alpha [Mus musculus]\n', 'hypoxia-inducible
factor 1-alpha [Gallus gallus]\n' ]

*What I'm trying to extract out of this*:

['Branchiostoma floridae', 'Mus musculus', 'Gallus gallus']
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Using Regular Expression to extracting string in brackets on a list

2013-12-29 Thread Joel Goldstick
On Sun, Dec 29, 2013 at 4:02 PM, Jing Ai jai...@g.rwu.edu wrote:

> Hello,
>
> I am trying to rewrite some contents on a long list that contains words
> within brackets and outside brackets and I'm having trouble extracting the
> words within brackets, especially since I have to add the append function
> for list as well.  Does anyone have any suggestions? Thank you!
>
> *An example of list*:
>
> ['hypothetical protein BRAFLDRAFT_208408 [Branchiostoma floridae]\n',
> 'hypoxia-inducible factor 1-alpha [Mus musculus]\n', 'hypoxia-inducible
> factor 1-alpha [Gallus gallus]\n' ]


Is the above a Python list, or is it what you get when you read a line
of a data file?  The reason I ask is that if it is a list, you can work
through it by looping over each list item.  Then maybe try some of these
ideas:

http://stackoverflow.com/questions/10017147/python-replace-characters-in-string
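For what it's worth, strings ending in '\n' are what you typically get from reading a file line by line. A minimal sketch of that, assuming the data came from a text file; io.StringIO and the name fake_file are stand-ins for a real file opened with open(), which the original post does not show:

```python
import io

# io.StringIO stands in for a real file object (an assumption for this sketch).
fake_file = io.StringIO(
    'hypoxia-inducible factor 1-alpha [Mus musculus]\n'
    'hypoxia-inducible factor 1-alpha [Gallus gallus]\n'
)

# readlines() returns a list of strings, each still ending in '\n'.
lines = fake_file.readlines()
for line in lines:
    print(repr(line))
```

Either way, once you have a list of strings you can loop over it one item at a time.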





-- 
Joel Goldstick
http://joelgoldstick.com


Re: [Tutor] Using Regular Expression to extracting string in brackets on a list

2013-12-29 Thread Steven D'Aprano
On Sun, Dec 29, 2013 at 04:02:01PM -0500, Jing Ai wrote:
> Hello,
>
> I am trying to rewrite some contents on a long list that contains words
> within brackets and outside brackets and I'm having trouble extracting the
> words within brackets, especially since I have to add the append function
> for list as well.  Does anyone have any suggestions? Thank you!
>
> *An example of list*:
>
> ['hypothetical protein BRAFLDRAFT_208408 [Branchiostoma floridae]\n',
> 'hypoxia-inducible factor 1-alpha [Mus musculus]\n', 'hypoxia-inducible
> factor 1-alpha [Gallus gallus]\n' ]
>
> *What I'm trying to extract out of this*:
>
> ['Branchiostoma floridae', 'Mus musculus', 'Gallus gallus']

You have a list of strings. Each string has exactly one pair of square 
brackets []. You want the content of the square brackets.

Start with a function that extracts the content of the square brackets 
from a single string.

def extract(s):
    start = s.find('[')
    if start == -1:
        # No opening bracket found. Should this be an error?
        return ''
    start += 1  # skip the bracket, move to the next character
    end = s.find(']', start)
    if end == -1:
        # No closing bracket found after the opening bracket.
        # Should this be an error instead?
        return s[start:]
    else:
        return s[start:end]


Let's test it and see if it works:

py> s = 'hypothetical protein BRAFLDRAFT_208408 [Branchiostoma floridae]\n'
py> extract(s)
'Branchiostoma floridae'

So far so good. Now let's write a loop:

names = []
for line in list_of_strings:
    names.append(extract(line))


where list_of_strings is your big list like the example above.

We can simplify the loop by using a list comprehension:

names = [extract(line) for line in list_of_strings]
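As a quick end-to-end check, here is a minimal sketch that runs the find-based extract on the three sample strings from the original question. The names sample and names_loop are only for illustration; everything else is taken from the code above.

```python
def extract(s):
    """Return the text between the first '[' and the next ']' in s."""
    start = s.find('[')
    if start == -1:
        return ''  # no opening bracket
    start += 1  # skip the bracket itself
    end = s.find(']', start)
    if end == -1:
        return s[start:]  # no closing bracket: take the rest of the string
    return s[start:end]

# The three strings from the original question.
sample = [
    'hypothetical protein BRAFLDRAFT_208408 [Branchiostoma floridae]\n',
    'hypoxia-inducible factor 1-alpha [Mus musculus]\n',
    'hypoxia-inducible factor 1-alpha [Gallus gallus]\n',
]

# Explicit loop with append.
names_loop = []
for line in sample:
    names_loop.append(extract(line))

# Equivalent list comprehension.
names = [extract(line) for line in sample]
print(names)  # ['Branchiostoma floridae', 'Mus musculus', 'Gallus gallus']
```

Both forms build the same list; the comprehension is just shorter.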


If you prefer to use a regular expression, that's simple enough. Here's 
a new version of the extract function:

import re

def extract(s):
    mo = re.search(r'\[(.*)\]', s)
    if mo:
        return mo.group(1)
    return ''


The list comprehension remains the same.
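One caveat, in case a line ever has more than one pair of brackets: `.*` is greedy, so the pattern above matches from the first '[' to the last ']' on the line. The sketch below shows a stricter pattern using a negated character class. The names BRACKETED, extract_first and extract_all, and the test string with two bracket pairs, are made up for illustration and are not part of the original data.

```python
import re

# [^\]]* matches any run of characters that are not ']', so the match
# stops at the first closing bracket instead of the last one.
BRACKETED = re.compile(r'\[([^\]]*)\]')

def extract_first(s):
    """Return the content of the first [...] pair, or '' if there is none."""
    mo = BRACKETED.search(s)
    return mo.group(1) if mo else ''

def extract_all(s):
    """Return the contents of every [...] pair as a list."""
    return BRACKETED.findall(s)

# Hypothetical line with two bracket pairs.
line = 'some protein [partial] [Mus musculus]\n'

print(re.search(r'\[(.*)\]', line).group(1))  # greedy: partial] [Mus musculus
print(extract_first(line))                    # partial
print(extract_all(line))                      # ['partial', 'Mus musculus']
```

If every string really does have exactly one pair of brackets, the two patterns give the same result.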


-- 
Steven


Re: [Tutor] Using Regular Expression to extracting string in brackets on a list

2013-12-29 Thread Joel Goldstick
On Sun, Dec 29, 2013 at 9:00 PM, Jing Ai jai...@g.rwu.edu wrote:

> Thanks, but I don't think I can get out the string in the brackets by only
> replacing other items...(there's too many things to replace and may
> interfere with the items within the string).




I am not sure what you mean by your previous sentence.  Check out Steven's
excellent answer.  Also, remember to reply to the list, or no one will see
your question.

Good luck




-- 
Joel Goldstick
http://joelgoldstick.com


Re: [Tutor] Using Regular Expression to extracting string in brackets on a list

2013-12-29 Thread Jing Ai
Thank you all for the suggestions! I decided to use Steven's re loop in the
end.

Joel, what I meant earlier was that the link you sent seems to suggest
replacing some characters in the list, and I'm not sure how that would work...







Re: [Tutor] Using Regular Expression to extracting string in brackets on a list

2013-12-29 Thread Bod Soutar
Steven's answer is probably a lot more robust, but I would use a simple
split.

mylist = ['hypothetical protein BRAFLDRAFT_208408 [Branchiostoma floridae]\n',
          'hypoxia-inducible factor 1-alpha [Mus musculus]\n',
          'hypoxia-inducible factor 1-alpha [Gallus gallus]\n']
for item in mylist:
    print(item.split('[')[1].split(']')[0])
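
One thing to watch with the split approach: item.split('[')[1] raises IndexError on a line that has no '['. A minimal sketch of the same idea using str.partition, which never raises; extract_partition is a hypothetical name, and returning '' for a missing bracket is an assumption borrowed from Steven's version:

```python
def extract_partition(s):
    """Return the text between the first '[' and the next ']' in s."""
    # partition returns (before, sep, after); sep is '' when '[' is missing.
    _, sep, rest = s.partition('[')
    if not sep:
        return ''
    # Keep everything up to the first ']' (or all of rest if there is none).
    inside, _, _ = rest.partition(']')
    return inside

mylist = [
    'hypothetical protein BRAFLDRAFT_208408 [Branchiostoma floridae]\n',
    'hypoxia-inducible factor 1-alpha [Mus musculus]\n',
    'hypoxia-inducible factor 1-alpha [Gallus gallus]\n',
]

names = [extract_partition(item) for item in mylist]
print(names)  # ['Branchiostoma floridae', 'Mus musculus', 'Gallus gallus']
```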

-- Bodsda

