On 07/01/2019 02:38, mhysnm1...@gmail.com wrote:

> All the descriptions of the transactions are
> in a single column. I am trying to work out the
> easiest method of identifying the same pattern
> of text in the fields.

What does a single column mean? That presumably is how it appears in the spreadsheet, but how is it stored in your Python code? A list? A list of lists? A dictionary? We don't know what your data looks like. Post a sample along with an explanation of how it is structured.

In general, when looking for patterns in text a regular expression is the tool of choice, but only if you know what the pattern looks like. Identifying patterns as you go is a much more difficult challenge.

> Then I am going to group these vendors by categories.

And how do you categorize them? Is the category also in the data, or is it some arbitrary thing that you have devised?

> In the field, there is the vendor name, suburb/town, type of transaction, etc.

"etc" is kind of vague! Show us some data and tell us which field is which. Without that it's difficult to impossible to tell you how to extract anything! The important thing is not how it looked in the spreadsheet but how it looks now that you have it in Python.

> How can I teach the program to learn new vendor names?

Usually you would use a set or dictionary and add new names as you find them.

> I was thinking of removing all the duplicate entries

Using a set would do that for you automatically.

> Was thinking of using dictionaries for this.
> But not sure if this is the best approach.

If you make the vendor name the key of a dictionary then it has the same effect as using a set. But whether a set or dict is best depends on what else you need to store. If it's only the vendor names then a set is best. If you want to store associated data then a dict is better.

You need to be much more specific about what your data looks like, how you identify the fields you want, and how you will categorize them.

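To make the set-versus-dict point concrete, here is a rough sketch. I haven't seen your data, so the descriptions, vendor names and categories below are all invented, and learn_vendor() is just a hypothetical helper name, not anything standard:

```python
# Hypothetical sample data, invented for illustration only.
descriptions = [
    "ACME HARDWARE SYDNEY CARD PURCHASE",
    "CORNER CAFE NEWTOWN EFTPOS",
    "ACME HARDWARE SYDNEY CARD PURCHASE",  # duplicate of the first entry
]

# A set keeps only one copy of each distinct value,
# so the duplicates disappear automatically.
unique_descriptions = set(descriptions)

# A dict lets you store something against each vendor name,
# here an (assumed) category string.
vendor_categories = {}


def learn_vendor(name, category):
    """Add the vendor if it is new; leave an existing entry untouched."""
    vendor_categories.setdefault(name, category)
    return vendor_categories[name]


learn_vendor("ACME HARDWARE", "hardware")
learn_vendor("CORNER CAFE", "food")
learn_vendor("ACME HARDWARE", "tools")  # already known, "hardware" is kept

print(len(unique_descriptions))
print(vendor_categories)
```

If all you ever need is "have I seen this name before?", the set on its own is enough. The dict only earns its keep once you want to hang a category (or anything else) off each vendor.
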
-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor