I built a string-matching system so that when we get new records from 
crawling the web, we can check whether the same information is already in 
the database. There are many parameters to tune, so I want to compute the 
F1 score for different parameter settings. I can easily test for True 
Positives and False Negatives by pulling all the data from the database and 
firing it at the matcher: every record should be a True Positive, since it 
comes from our own database. But I also want to find the True Negatives and 
False Positives, so I am trying to write some code that will garble the 
database data. At the REPL I do this: 

user> (def phone "+31-162-374000")

user> (def garbage "eeee")

This does what I want:

user> (clojure.string/replace phone  #"[ -]"  garbage) 
"+31eeee162eeee374000"


This doesn't:

user> (clojure.string/replace phone  #"[ -_]"  garbage) 
"eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee"

And this doesn't either:

user> (clojure.string/replace phone  #"[ -\_]"  garbage) 
"eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee"

What am I doing wrong here? How do I match against an underscore?
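
For reference, here is a minimal sketch of what seems to be going on, assuming the usual java.util.regex rules for character classes. Inside [...], a hyphen between two characters denotes a range, so [ -_] would cover everything from space (U+0020) to underscore (U+005F), which includes "+", "-" and all the digits. Putting the hyphen last, or escaping it, should make it a literal. The string "sample" and the result names are made up for illustration; they are not part of my actual data.

```clojure
(require '[clojure.string :as str])

;; Hypothetical sample containing a hyphen, an underscore, and a space.
(def sample "+31-162_374 000")
(def garbage "eeee")

;; Hyphen between two characters: a range from space to underscore,
;; which covers every character in sample.
(def range-result (str/replace sample #"[ -_]" garbage))

;; Hyphen placed last: treated as a literal, so only space,
;; underscore, and hyphen are replaced.
(def literal-result (str/replace sample #"[ _-]" garbage))

;; Escaping the hyphen should have the same effect.
(def escaped-result (str/replace sample #"[ \-_]" garbage))
```

If this reading is right, range-result is nothing but e's, while literal-result and escaped-result keep the "+" and the digits intact.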

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en