Thanks. On 6/1/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > > > Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005 06:57:19 > PM: > > > > -- How is 'xFxIxExLxDx1x' easier to search than 'FIELD1'? > > > > It's not easier. It is; however, accurate for the purpose at hand. > > FIELD1 isn't completely inclusive...it would miss FIELD-1... > > > > -- However, It seems to me that this kind of data manipulation > > (cleanup) needs to happen BEFORE the data enters the databse. > > > > Sometimes FIELD-1 is the ACTUAL data, with no erroneus > > characters...and sometimes it is FIELD1 with an erroneous (or > > unwanted) - character before the 1)...so cleaning the data would > > actually be corrupting some of it. > > > > I need to keep the data in it's original form, but also allow for > > querying without worrying about the special characters inside the > > column. > > > > FYI, these are part numbers off of electronic components, many of them > > coming from China...so, a Cisco part may have an MPN of RX321, or > > RX321-TR...either is valid. Now, the corresponding Chinese part > > number for the first one, may come back as RX32-1...which is out of my > > control. > > > > Cleaning the data would be the wrong approach, because it would > > actually invalidate the second Cisco part number, which MEANT to > > include the special characters. > > > > For this reason, the user wants to be able to search for 'RX321' and > > 'RX321TR' respectively, and not worry about whether the data is > > erroneous or valid; just to basically ignore all the characters and > > let a human decide what they want. > > > > If you have a more elegant solution, I'm all ears :-). > > > > On 5/27/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > > > > > > > > > Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005 > 04:38:40 > > > PM: > > > > > > > > > > Ya, it is a little too specific...here's why I need it. > > > > > > > > I have a client that wants to search for part numbers in his DB. The > > > > problem is, they come into his DB from external sources, with all > > > > sorts of special characters in them... > > > > > > > > So, he has fields like > > > > field_one!, > > > > fi--eld 2, > > > > @fi#eld__3xxx > > > > > > > > etc > > > > > > > > but, he wants to do a search for 'fieldone' and return the first one, > > > > 'field2' returns the second, etc...basically disregard all non-alphas > > > > padding every character in the search string. > > > > > > > > On 5/27/05, Eric Bergen <[EMAIL PROTECTED]> wrote: > > > > > I'm working on a set of UDFs for preg functions. > > > > > > > > > > > > > > > [EMAIL PROTECTED] wrote: > > > > > > > > > > >I have a hard time figuring out when you would use such a function. > I > > > do > > > > > >not believe you will be able to duplicate this behavior without > > > > > >constructing your own UDF or by writing a stored procedure. BTW, > why > > > *do* > > > > > >you want this function? > > > > > > > > > > > >Shawn Green > > > > > >Database Administrator > > > > > >Unimin Corporation - Spruce Pine > > > > > > > > > > > >Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005 > > > 01:30:35 > > > > > >PM: > > > > > > > > > > > > > > > > > > > > > > > >>I'm trying to replicate this PHP behavior in a MySQL stored > procedure. > > > > > >> The purpose is to pad every character of the string with a pad > > > > > >>character. For example, if the pad character is 'x' and thestring > is > > > > > >>'STRING', the result is 'xSxTxRxIxNxGx'. > > > > > >> > > > > > >>Here is the PHP code if it helps. I'd like to use a regular > > > > > >>expression to replace, but I guess I could loop through the string > > > > > >>char by char and build a new one, it's just less elegant. Thanks > in > > > > > >>advance. > > > > > >> > > > > > >>PHP: > > > > > >>------------- > > > > > >>$regPattern = implode('x', preg_split('//', "STRING", -1, > > > > > >>PREG_SPLIT_NO_EMPTY)); > > > > > >> > > > > > > Thank you very much. I find this whole "padding" process very > > > counterintuitive. I have a few minor questions, if you don't mind. How > is > > > 'xFxIxExLxDx1x' easier to search than 'FIELD1'? Would you, could you > please > > > explain the theory behind why and when this kind of padding should be > done? > > > What problem does it solve and how is it a solution to that problem? > This > > > is completely baffling to me and I thought I had seen a lot of weird > data > > > before :-) > > > > > > However, It seems to me that this kind of data manipulation (cleanup) > needs > > > to happen BEFORE the data enters the databse. What data import > tool/process > > > is your client using? Can you not change the import process to scrub the > > > data and does it not have a better facility to interleave padding into a > > > string than a MySQL stored procedure or UDF? > > > > > > Thanks for you patience! > > > > > > > > > Shawn Green > > > Database Administrator > > > Unimin Corporation - Spruce Pine > > > > > > > > I would suggest the following technique. Keep your original data (trimmed of > leading and trailing whitespace characters) in one field and a "cleansed" > version of the same information in a second field. By cleansing you should > eliminate all non-alphanumerics from your search string. > > I have some direct insight to searching similar but different product codes > (same product/different codes & different products/same codes). My wife > works for a company that consolidates medical inventories for major hospital > and health care organizations and there is usually more than one vendor to > the same "product". This leads to the exact same inventory code confusion > you describe. In the majority of their data, the vendor will either prepend > or append a manufacturer's id code or their own id code to the "normal" part > number. Part of my wife's job is to filter through the confusion and match > identical products to each other (based on vendor's ID, manufacturer's id, > product codes, and descriptions) so that her clients no longer need to > purchase the exact same thing from multiple sources (at a higher cost per > unit) when they could be buying in bulk from a single source saving millions > of dollars. > > That's part of how they do it. I figure it may work for you too. > > > Shawn Green > Database Administrator > Unimin Corporation - Spruce Pine > >
-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]