Re: Quickie - Regexp for a string not at the beginning of the line
On 10/25/2012 11:45 PM, Rivka Miller wrote:

Thanks everyone, esp this gentleman. The solution that worked best for me is just to use a DOT before the string, as the one at the beginning of the line did not have any char before it.

That's fine, but do you understand that that is not an RE that matches on $hello$ not at the start of a line? It's an RE that matches on any char followed by $hello$ anywhere in the line. There's a difference: if you use a tool that prints the text that matches an RE, then the output if the first RE existed would be $hello$, while the output for the second RE would be X$hello$ or Y$hello$ or ...

In some tools you can use /(.)$hello$/ or similar to ignore the first part of the RE (.) and just print the second part, $hello$, but that ability and its syntax are tool-specific. You still can't say "here's an RE that does this"; you've got to say "here's how to find this text using tool whatever".

Ed.

I guess, this requires the ability to ignore the caret as the beginning of the line. I am a satisfied customer. No need for returns. :)

On Oct 25, 7:11 pm, Ben Bacarisse ben.use...@bsb.me.uk wrote:

Rivka Miller rivkaumil...@gmail.com writes:

On Oct 25, 2:27 pm, Danny dann90...@gmail.com wrote:

Why don't you just give us the string/input, say a line or two, and what you want off of it, so we can tell better what to suggest?

no one has really helped yet.

Really? I was going to reply but then I saw Janis had given you the answer. If it's not the answer, you should just reply saying what it is that's wrong with it.

I want to search and modify.

Ah. That was missing from the original post. You can't expect people to help with questions that weren't asked! To replace you will usually have to capture the single preceding character. E.g. in sed:

sed -e 's/\(.\)$hello\$/\1XXX/'

but some RE engines (Perl's, for example) allow you to specify zero-width assertions. You could, in Perl, write

s/(?<=.)\$hello\$/XXX/

without having to capture whatever preceded the target string.
But since Perl also has negative zero-width look-behind you can code your request even more directly:

s/(?<!^)\$hello\$/XXX/

I dont wanna be tied to a specific language etc so I just want a regexp and as many versions as possible. Maybe I should try in emacs and so I am now posting to emacs groups also, although javascript has rich set of regexp facilities.

You can't always have a universal solution because different RE implementations have different syntax and semantics, but you should be able to translate Janis's solution of matching *something* before your target into every RE implementation around.

examples

$hello$ should not be selected but
not hello but all of the $hello$ and $hello$ ... $hello$ each one selected

I have taken your $s to be literal. That's not 100% obvious since $ is a common (universal?) RE meta-character.

<snip>

-- Ben.

-- http://mail.python.org/mailman/listinfo/python-list
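For anyone who wants to try the two approaches Ben describes from Python rather than sed or Perl, here is a minimal sketch. It assumes Python's standard re module, which accepts a zero-width look-behind such as (?<!^); the function names and the XXX replacement are illustrative only and do not come from the thread.

```python
import re

# Approach 1: capture the single preceding character and put it back,
# as in the sed example (translates to engines without look-behind).
CAPTURE = re.compile(r'(.)\$hello\$')

# Approach 2: negative look-behind, as in the Perl example. The target is
# matched only when it is not immediately preceded by the start of a line.
LOOKBEHIND = re.compile(r'(?<!^)\$hello\$', re.MULTILINE)


def replace_with_capture(text, replacement='XXX'):
    """Replace $hello$ when some character precedes it on the same line."""
    return CAPTURE.sub(lambda m: m.group(1) + replacement, text)


def replace_with_lookbehind(text, replacement='XXX'):
    """Replace $hello$ unless it sits at the very start of a line."""
    return LOOKBEHIND.sub(lambda m: replacement, text)


if __name__ == '__main__':
    sample = '$hello$ and $hello$ ... $hello$'
    print(replace_with_capture(sample))
    print(replace_with_lookbehind(sample))
```

The two functions agree on the sample line, but they can differ when occurrences are back to back: in a$hello$$hello$ the capture version consumes the first $hello$ together with its preceding character, so the second occurrence has no unconsumed character left in front of it and is skipped, while the look-behind version replaces both.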
Re: Quickie - Regexp for a string not at the beginning of the line
On 10/25/2012 8:08 PM, Rivka Miller wrote:

On Oct 25, 2:27 pm, Danny dann90...@gmail.com wrote:

Why don't you just give us the string/input, say a line or two, and what you want off of it, so we can tell better what to suggest?

no one has really helped yet.

Because there is no solution: there IS no _RE_ that will match a string not at the beginning of a line. Now if you want to know how to extract a string that matches an RE in awk, that'd be (just one way):

awk 'match($0,/.[$]hello[$]/) { print substr($0,RSTART+1,RLENGTH-1) }'

and other tools would have their ways of producing the same output, but that's not the question you're asking.

Ed.

I want to search and modify. I dont wanna be tied to a specific language etc so I just want a regexp and as many versions as possible. Maybe I should try in emacs and so I am now posting to emacs groups also, although javascript has rich set of regexp facilities.

examples

$hello$ should not be selected but
not hello but all of the $hello$ and $hello$ ... $hello$ each one selected

= original post =

Hello Programmers,

I am looking for a regexp for a string not at the beginning of the line. For example, I want to find $hello$ that does not occur at the beginning of the string, ie all $hello$ that exclude ^$hello$.

In addition, if you have a more difficult problem along the same lines, I would appreciate it. For a single character, eg not at the beginning of the line, it is easier, ie ^[^]+ but I cant use the same method for more than one character string as permutation is present and probably for more than one occurrence, greedy or non-greedy version of [^]+ would pick first or last but not the middle ones, unless I break the line as I go and use the non-greedy version of +. I do have the non-greedy version available, but what if I didn't?

If you cannot solve the problem completely, just give me a quick solution with the first non beginning of the line and I will go from there as I need it in a hurry.
Thanks
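Ed's awk one-liner matches one extra leading character and then trims it off the reported match. A rough Python rendering of the same idea is sketched below, assuming the standard re module; the function name and the (offset, text) return shape are hypothetical choices made for illustration.

```python
import re

# One arbitrary character followed by the literal text $hello$.
# [$] is used, as in the awk version, to make the dollar signs literal.
PATTERN = re.compile(r'.([$]hello[$])')


def extract_non_leading(line):
    """Return (offset, text) for the first $hello$ that is not at the
    start of the line, or None if there is no such occurrence."""
    m = PATTERN.search(line)
    if m is None:
        return None
    # Skip the leading character, like substr($0, RSTART+1, RLENGTH-1) in awk.
    return m.start(1), m.group(1)


if __name__ == '__main__':
    print(extract_non_leading('$hello$ x $hello$'))
    print(extract_non_leading('$hello$'))
```

As in the awk version, the regular expression itself still matches the extra character; it is the surrounding code that drops it, which is the distinction Ed is drawing.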
Re: sed/awk/perl: How to replace all spaces each with an underscore that occur before a specific string ?
On Aug 22, 1:11 pm, bolega gnuist...@gmail.com wrote:

sed/awk/perl: How to replace all spaces each with an underscore that occur before a specific string ? I really prefer a sed one liner.

Why?

Example

Input : This is my book. It is too thick to read. The author gets little royalty but the publisher makes a lot.

Output: This_is_my_book._It_is_too__thick_to read. The author gets little royalty but the publisher makes a lot.

We replaced all the spaces with underscores before the first occurrence of the string "to".

No, you replaced all ... the string "to " (note the space).

awk '{idx=index($0,"to "); tgt=substr($0,1,idx-1); gsub(/ /,"_",tgt); print tgt substr($0,idx)}' file

Ed.
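The same split-then-substitute idea as the awk command can be sketched in Python. This is only an illustration: the function name and the default marker "to " (with the trailing space Ed points out) are assumptions, and lines that do not contain the marker are returned unchanged.

```python
def underscore_before(line, marker="to "):
    """Replace every space that occurs before the first `marker` with '_'."""
    idx = line.find(marker)
    if idx == -1:
        # Marker not present: leave the line alone.
        return line
    # Only the part before the marker is modified; the rest is kept as-is.
    return line[:idx].replace(" ", "_") + line[idx:]


if __name__ == "__main__":
    print(underscore_before("This is my book. It is too thick to read. The author"))
```

Searching for "to " rather than "to" matters here because a bare "to" would already match inside "too".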
Re: Comparing 2 similar strings?
William Park wrote:

How do you compare 2 strings, and determine how much they are close to each other? Eg.

aqwerty
qwertyb

are similar to each other, except for first/last char. But, how do I quantify that? I guess you can say for the above 2 strings that
- at max, 6 chars out of 7 are same sequence -- 85% max

But, for

qawerty
qwerbty

max correlation is
- 3 chars out of 7 are the same sequence -- 42% max

(Crossposted to 3 of my favourite newsgroups.)

However you like is probably the right answer, but one way might be to compare their soundex encoding (http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?soundex) and figure out percentage difference based on comparing the numeric part.

Ed.
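The measure William describes (longest shared run of characters, as a fraction of the string length) can be computed directly. The sketch below is not Ed's soundex suggestion; it swaps in Python's standard difflib.SequenceMatcher to find the longest common run, and the function names are made up for illustration.

```python
from difflib import SequenceMatcher


def longest_common_run(a, b):
    """Return the longest contiguous run of characters shared by a and b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    m = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[m.a:m.a + m.size]


def run_similarity(a, b):
    """Length of the longest shared run divided by the longer string's length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return len(longest_common_run(a, b)) / longest


if __name__ == "__main__":
    for pair in (("aqwerty", "qwertyb"), ("qawerty", "qwerbty")):
        print(pair, longest_common_run(*pair), run_similarity(*pair))
```

For the two pairs in the question this finds the runs qwerty (6 of 7) and wer (3 of 7), matching the figures William worked out by hand.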
Re: Comparing 2 similar strings?
John Machin wrote:

On Wed, 18 May 2005 20:03:53 -0500, Ed Morton [EMAIL PROTECTED] wrote:

<snip>

I assume you were actually being facetious and trying to make the point that names that don't look the same on paper can have the same soundex encoding and that's obviously countered with the fact that soundex is just a cheap and cheerful way to find names that probably sound similar, which can vary tremendously based on ethnicity or accent.

*If* you want phonetic similarity, there are methods that are much better than soundex, in the sense of fewer false positives and fewer false negatives. Google for NYSIIS, dolby, metaphone, caverphone.

And I assume I'd find they all have pros and cons too, otherwise you'd be referring to THE best one rather than a selection. It seems a bit pointless to go browsing through the documentation on them when someone who presumably already has can't just state the best one for the job.

Cheap? You get what you pay for. Cheerful? What's the relevance?

Cheap and cheerful is a colloquial expression meaning cost-effective.

Someone who types Mousaferiadis into a customer search screen and gets back several lines of McPherson and MacPherson is unlikely to be cheerful -- even before we factor in the speed [soundex divides the universe into a relatively small number of buckets]. Someone who's looking for Erin when they should be looking for Aaron (or vice versa) won't get much cheer out of soundex, either.

That goes back to accent. In [some parts at least of] the USA Erin sounds very much like Aaron, whereas in the UK the 2 are very dissimilar. I assume since you apparently consider them similar that you live in the USA and so would consider soundex as providing a false negative by saying they don't match. Perhaps one of the other approaches you suggest would report that they do match but that wouldn't make it clearly a better choice to everyone.

It's a reasonable approach to consider given the very loose requirements presented.
Soundex is *NEVER* a reasonable approach to consider. Phonetic variation is only one consideration. In any case, the OP didn't appear to be concerned with phonetic variations.

The OP didn't say what the application was at all, but you're right that from his example he does SEEM more interested in character matches than phonetic ones, so he'd presumably quickly discard phonetic comparisons if that's really not what he wants.

Ed.
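To make the examples in this exchange concrete, here is a small sketch of the commonly described American Soundex rules (keep the first letter, encode the remaining consonants as digits, collapse adjacent identical codes, ignore h and w, pad to four characters). Soundex has several variants, so treat this as one plausible reading rather than a reference implementation; the function name is illustrative.

```python
# Digit assigned to each consonant group; vowels, h, w and y have no code.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return a 4-character Soundex-style code for name ('' if no letters)."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
        if len(result) == 4:
            break
    return result.ljust(4, "0")


if __name__ == "__main__":
    for n in ("Mousaferiadis", "McPherson", "MacPherson", "Erin", "Aaron"):
        print(n, soundex(n))
```

Under these rules Mousaferiadis, McPherson and MacPherson all encode to M216, which is the false positive John describes, while Erin (E650) and Aaron (A650) differ only in the retained first letter and so do not match.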