Re: MySQL preg_split functionality?

2005-06-01 Thread Scott Klarenbach
Thanks.

On 6/1/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> 
> 
> Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005 06:57:19
> PM:
> 
> 
> > -- How is 'xFxIxExLxDx1x' easier to search than 'FIELD1'?
> > 
> > It's not easier.  It is; however, accurate for the purpose at hand. 
> > FIELD1 isn't completely inclusive...it would miss FIELD-1...
> > 
> > -- However, It seems to me that this kind of data manipulation
> > (cleanup) needs to happen BEFORE the data enters the databse.
> > 
> > Sometimes FIELD-1 is the ACTUAL data, with no erroneus
> > characters...and sometimes it is FIELD1 with an erroneous (or
> > unwanted) - character before the 1)...so cleaning the data would
> > actually be corrupting some of it.
> > 
> > I need to keep the data in it's original form, but also allow for
> > querying without worrying about the special characters inside the
> > column.
> > 
> > FYI, these are part numbers off of electronic components, many of them
> > coming from China...so, a Cisco part may have an MPN of RX321, or
> > RX321-TR...either is valid.  Now, the corresponding Chinese part
> > number for the first one, may come back as RX32-1...which is out of my
> > control.
> > 
> > Cleaning the data would be the wrong approach, because it would
> > actually invalidate the second Cisco part number, which MEANT to
> > include the special characters.
> > 
> > For this reason, the user wants to be able to search for 'RX321' and
> > 'RX321TR' respectively, and not worry about whether the data is
> > erroneous or valid; just to basically ignore all the characters and
> > let a human decide what they want.
> > 
> > If you have a more elegant solution, I'm all ears :-).
> > 
> > On 5/27/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > > 
> > > 
> > > Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005
> 04:38:40
> > > PM:
> > > 
> > > 
> > > > Ya, it is a little too specific...here's why I need it.
> > > > 
> > > > I have a client that wants to search for part numbers in his DB.  The
> > > > problem is, they come into his DB from external sources, with all
> > > > sorts of special characters in them...
> > > > 
> > > > So, he has fields like 
> > > > field_one!,
> > > > fi--eld   2,
> > > > @fi#eld__3xxx
> > > > 
> > > >  etc
> > > > 
> > > > but, he wants to do a search for 'fieldone' and return the first one,
> > > > 'field2' returns the second, etc...basically disregard all non-alphas
> > > > padding every character in the search string.
> > > > 
> > > > On 5/27/05, Eric Bergen <[EMAIL PROTECTED]> wrote:
> > > > > I'm working on a set of UDFs for preg functions.
> > > > > 
> > > > > 
> > > > > [EMAIL PROTECTED] wrote:
> > > > > 
> > > > > >I have a hard time figuring out when you would use such a function.
> I
> > > do
> > > > > >not believe you will be able to duplicate this behavior without
> > > > > >constructing your own UDF or by writing a stored procedure. BTW,
> why
> > > *do*
> > > > > >you want this function?
> > > > > >
> > > > > >Shawn Green
> > > > > >Database Administrator
> > > > > >Unimin Corporation - Spruce Pine
> > > > > >
> > > > > >Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005
> > > 01:30:35
> > > > > >PM:
> > > > > >
> > > > > >
> > > > > >
> > > > > >>I'm trying to replicate this PHP behavior in a MySQL stored
> procedure.
> > > > > >> The purpose is to pad every character of the string with a pad
> > > > > >>character.  For example, if the pad character is 'x' and thestring
> is
> > > > > >>'STRING', the result is 'xSxTxRxIxNxGx'.
> > > > > >>
> > > > > >>Here is the PHP code if it helps.  I'd like to use a regular
> > > > > >>expression to replace, but I guess I could loop through the string
> > > > > >>char by char and build a new one, it's just less elegant.  Thanks
> in
> > > > > >>advance.
> > > > > >>
> > > > > >>PHP:
> > > > > >>-
> > > > > >>$regPattern =  implode('x', preg_split('//', "STRING", -1,
> > > > > >>PREG_SPLIT_NO_EMPTY));
> > > > > >>
> > > 
> > > Thank you very much. I find this whole "padding" process very
> > > counterintuitive. I have a few minor questions, if you don't mind. How
> is
> > > 'xFxIxExLxDx1x' easier to search than 'FIELD1'?  Would you, could you
> please
> > > explain the theory behind why and when this kind of padding should be
> done?
> > > What problem does it solve and how is it a solution to that problem? 
> This
> > > is completely baffling to me and I thought I had seen a lot of weird
> data
> > > before :-) 
> > > 
> > > However, It seems to me that this kind of data manipulation (cleanup)
> needs
> > > to happen BEFORE the data enters the databse. What data import
> tool/process
> > > is your client using? Can you not change the import process to scrub the
> > > data and does it not have a better facility to interleave padding into a
> > > string than a MySQL stored procedure or UDF? 
> > > 
> > > Thanks for you patience! 
> > > 
> > > 
> > > Shawn Green
> > > Database Administrator
> > > Unimin Corporation

Re: MySQL preg_split functionality?

2005-06-01 Thread SGreen
Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005 06:57:19 
PM:

> -- How is 'xFxIxExLxDx1x' easier to search than 'FIELD1'?
> 
> It's not easier.  It is; however, accurate for the purpose at hand. 
> FIELD1 isn't completely inclusive...it would miss FIELD-1...
> 
> -- However, It seems to me that this kind of data manipulation
> (cleanup) needs to happen BEFORE the data enters the databse.
> 
> Sometimes FIELD-1 is the ACTUAL data, with no erroneus
> characters...and sometimes it is FIELD1 with an erroneous (or
> unwanted) - character before the 1)...so cleaning the data would
> actually be corrupting some of it.
> 
> I need to keep the data in it's original form, but also allow for
> querying without worrying about the special characters inside the
> column.
> 
> FYI, these are part numbers off of electronic components, many of them
> coming from China...so, a Cisco part may have an MPN of RX321, or
> RX321-TR...either is valid.  Now, the corresponding Chinese part
> number for the first one, may come back as RX32-1...which is out of my
> control.
> 
> Cleaning the data would be the wrong approach, because it would
> actually invalidate the second Cisco part number, which MEANT to
> include the special characters.
> 
> For this reason, the user wants to be able to search for 'RX321' and
> 'RX321TR' respectively, and not worry about whether the data is
> erroneous or valid; just to basically ignore all the characters and
> let a human decide what they want.
> 
> If you have a more elegant solution, I'm all ears :-).
> 
> On 5/27/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > 
> > 
> > Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005 
04:38:40
> > PM:
> > 
> > 
> > > Ya, it is a little too specific...here's why I need it.
> > > 
> > > I have a client that wants to search for part numbers in his DB. The
> > > problem is, they come into his DB from external sources, with all
> > > sorts of special characters in them...
> > > 
> > > So, he has fields like 
> > > field_one!,
> > > fi--eld   2,
> > > @fi#eld__3xxx
> > > 
> > >  etc
> > > 
> > > but, he wants to do a search for 'fieldone' and return the first 
one,
> > > 'field2' returns the second, etc...basically disregard all 
non-alphas
> > > padding every character in the search string.
> > > 
> > > On 5/27/05, Eric Bergen <[EMAIL PROTECTED]> wrote:
> > > > I'm working on a set of UDFs for preg functions.
> > > > 
> > > > 
> > > > [EMAIL PROTECTED] wrote:
> > > > 
> > > > >I have a hard time figuring out when you would use such a 
function. I
> > do
> > > > >not believe you will be able to duplicate this behavior without
> > > > >constructing your own UDF or by writing a stored procedure. BTW, 
why
> > *do*
> > > > >you want this function?
> > > > >
> > > > >Shawn Green
> > > > >Database Administrator
> > > > >Unimin Corporation - Spruce Pine
> > > > >
> > > > >Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005
> > 01:30:35
> > > > >PM:
> > > > >
> > > > >
> > > > >
> > > > >>I'm trying to replicate this PHP behavior in a MySQL stored 
procedure.
> > > > >> The purpose is to pad every character of the string with a pad
> > > > >>character.  For example, if the pad character is 'x' and 
thestring is
> > > > >>'STRING', the result is 'xSxTxRxIxNxGx'.
> > > > >>
> > > > >>Here is the PHP code if it helps.  I'd like to use a regular
> > > > >>expression to replace, but I guess I could loop through the 
string
> > > > >>char by char and build a new one, it's just less elegant. Thanks 
in
> > > > >>advance.
> > > > >>
> > > > >>PHP:
> > > > >>-
> > > > >>$regPattern =  implode('x', preg_split('//', "STRING", -1,
> > > > >>PREG_SPLIT_NO_EMPTY));
> > > > >>
> > 
> > Thank you very much. I find this whole "padding" process very
> > counterintuitive. I have a few minor questions, if you don't mind. How 
is
> > 'xFxIxExLxDx1x' easier to search than 'FIELD1'?  Would you, could you 
please
> > explain the theory behind why and when this kind of padding should be 
done?
> > What problem does it solve and how is it a solution to that problem? 
This
> > is completely baffling to me and I thought I had seen a lot of weird 
data
> > before :-) 
> > 
> > However, It seems to me that this kind of data manipulation (cleanup) 
needs
> > to happen BEFORE the data enters the databse. What data import 
tool/process
> > is your client using? Can you not change the import process to scrub 
the
> > data and does it not have a better facility to interleave padding into 
a
> > string than a MySQL stored procedure or UDF? 
> > 
> > Thanks for you patience! 
> > 
> > 
> > Shawn Green
> > Database Administrator
> > Unimin Corporation - Spruce Pine 
> > 
> >

I would suggest the following technique. Keep your original data (trimmed 
of leading and trailing whitespace characters) in one field and a 
"cleansed" version of the same information in a second field. By cleansing 
you should eliminate all non-alphanumerics from your search string. 

I have som

Re: MySQL preg_split functionality?

2005-05-27 Thread Scott Klarenbach
-- How is 'xFxIxExLxDx1x' easier to search than 'FIELD1'?

It's not easier.  It is; however, accurate for the purpose at hand. 
FIELD1 isn't completely inclusive...it would miss FIELD-1...

-- However, It seems to me that this kind of data manipulation
(cleanup) needs to happen BEFORE the data enters the databse.

Sometimes FIELD-1 is the ACTUAL data, with no erroneus
characters...and sometimes it is FIELD1 with an erroneous (or
unwanted) - character before the 1)...so cleaning the data would
actually be corrupting some of it.

I need to keep the data in it's original form, but also allow for
querying without worrying about the special characters inside the
column.

FYI, these are part numbers off of electronic components, many of them
coming from China...so, a Cisco part may have an MPN of RX321, or
RX321-TR...either is valid.  Now, the corresponding Chinese part
number for the first one, may come back as RX32-1...which is out of my
control.

Cleaning the data would be the wrong approach, because it would
actually invalidate the second Cisco part number, which MEANT to
include the special characters.

For this reason, the user wants to be able to search for 'RX321' and
'RX321TR' respectively, and not worry about whether the data is
erroneous or valid; just to basically ignore all the characters and
let a human decide what they want.

If you have a more elegant solution, I'm all ears :-).

On 5/27/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> 
> 
> Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005 04:38:40
> PM:
> 
> 
> > Ya, it is a little too specific...here's why I need it.
> > 
> > I have a client that wants to search for part numbers in his DB.  The
> > problem is, they come into his DB from external sources, with all
> > sorts of special characters in them...
> > 
> > So, he has fields like 
> > field_one!,
> > fi--eld   2,
> > @fi#eld__3xxx
> > 
> >  etc
> > 
> > but, he wants to do a search for 'fieldone' and return the first one,
> > 'field2' returns the second, etc...basically disregard all non-alphas
> > padding every character in the search string.
> > 
> > On 5/27/05, Eric Bergen <[EMAIL PROTECTED]> wrote:
> > > I'm working on a set of UDFs for preg functions.
> > > 
> > > 
> > > [EMAIL PROTECTED] wrote:
> > > 
> > > >I have a hard time figuring out when you would use such a function. I
> do
> > > >not believe you will be able to duplicate this behavior without
> > > >constructing your own UDF or by writing a stored procedure. BTW, why
> *do*
> > > >you want this function?
> > > >
> > > >Shawn Green
> > > >Database Administrator
> > > >Unimin Corporation - Spruce Pine
> > > >
> > > >Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005
> 01:30:35
> > > >PM:
> > > >
> > > >
> > > >
> > > >>I'm trying to replicate this PHP behavior in a MySQL stored procedure.
> > > >> The purpose is to pad every character of the string with a pad
> > > >>character.  For example, if the pad character is 'x' and the string is
> > > >>'STRING', the result is 'xSxTxRxIxNxGx'.
> > > >>
> > > >>Here is the PHP code if it helps.  I'd like to use a regular
> > > >>expression to replace, but I guess I could loop through the string
> > > >>char by char and build a new one, it's just less elegant.  Thanks in
> > > >>advance.
> > > >>
> > > >>PHP:
> > > >>-
> > > >>$regPattern =  implode('x', preg_split('//', "STRING", -1,
> > > >>PREG_SPLIT_NO_EMPTY));
> > > >>
> 
> Thank you very much. I find this whole "padding" process very
> counterintuitive. I have a few minor questions, if you don't mind. How is
> 'xFxIxExLxDx1x' easier to search than 'FIELD1'?  Would you, could you please
> explain the theory behind why and when this kind of padding should be done?
> What problem does it solve and how is it a solution to that problem?  This
> is completely baffling to me and I thought I had seen a lot of weird data
> before :-) 
> 
> However, It seems to me that this kind of data manipulation (cleanup) needs
> to happen BEFORE the data enters the databse. What data import tool/process
> is your client using? Can you not change the import process to scrub the
> data and does it not have a better facility to interleave padding into a
> string than a MySQL stored procedure or UDF? 
> 
> Thanks for you patience! 
> 
> 
> Shawn Green
> Database Administrator
> Unimin Corporation - Spruce Pine 
> 
>

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: MySQL preg_split functionality?

2005-05-27 Thread SGreen
Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005 04:38:40 
PM:

> Ya, it is a little too specific...here's why I need it.
> 
> I have a client that wants to search for part numbers in his DB.  The
> problem is, they come into his DB from external sources, with all
> sorts of special characters in them...
> 
> So, he has fields like 
> field_one!,
> fi--eld   2,
> @fi#eld__3xxx
> 
>  etc
> 
> but, he wants to do a search for 'fieldone' and return the first one,
> 'field2' returns the second, etc...basically disregard all non-alphas
> padding every character in the search string.
> 
> On 5/27/05, Eric Bergen <[EMAIL PROTECTED]> wrote:
> > I'm working on a set of UDFs for preg functions.
> > 
> > 
> > [EMAIL PROTECTED] wrote:
> > 
> > >I have a hard time figuring out when you would use such a function. I 
do
> > >not believe you will be able to duplicate this behavior without
> > >constructing your own UDF or by writing a stored procedure. BTW, why 
*do*
> > >you want this function?
> > >
> > >Shawn Green
> > >Database Administrator
> > >Unimin Corporation - Spruce Pine
> > >
> > >Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005 
01:30:35
> > >PM:
> > >
> > >
> > >
> > >>I'm trying to replicate this PHP behavior in a MySQL stored 
procedure.
> > >> The purpose is to pad every character of the string with a pad
> > >>character.  For example, if the pad character is 'x' and the string 
is
> > >>'STRING', the result is 'xSxTxRxIxNxGx'.
> > >>
> > >>Here is the PHP code if it helps.  I'd like to use a regular
> > >>expression to replace, but I guess I could loop through the string
> > >>char by char and build a new one, it's just less elegant.  Thanks in
> > >>advance.
> > >>
> > >>PHP:
> > >>-
> > >>$regPattern =  implode('x', preg_split('//', "STRING", -1,
> > >>PREG_SPLIT_NO_EMPTY));
> > >>

Thank you very much. I find this whole "padding" process very 
counterintuitive. I have a few minor questions, if you don't mind. How is 
'xFxIxExLxDx1x' easier to search than 'FIELD1'?  Would you, could you 
please explain the theory behind why and when this kind of padding should 
be done? What problem does it solve and how is it a solution to that 
problem?  This is completely baffling to me and I thought I had seen a lot 
of weird data before :-)

However, It seems to me that this kind of data manipulation (cleanup) 
needs to happen BEFORE the data enters the databse. What data import 
tool/process is your client using? Can you not change the import process 
to scrub the data and does it not have a better facility to interleave 
padding into a string than a MySQL stored procedure or UDF?

Thanks for you patience!

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine



Re: MySQL preg_split functionality?

2005-05-27 Thread Scott Klarenbach
Ya, it is a little too specific...here's why I need it.

I have a client that wants to search for part numbers in his DB.  The
problem is, they come into his DB from external sources, with all
sorts of special characters in them...

So, he has fields like 
field_one!,
fi--eld   2,
@fi#eld__3xxx

 etc

but, he wants to do a search for 'fieldone' and return the first one,
'field2' returns the second, etc...basically disregard all non-alphas
padding every character in the search string.

On 5/27/05, Eric Bergen <[EMAIL PROTECTED]> wrote:
> I'm working on a set of UDFs for preg functions.
> 
> 
> [EMAIL PROTECTED] wrote:
> 
> >I have a hard time figuring out when you would use such a function. I do
> >not believe you will be able to duplicate this behavior without
> >constructing your own UDF or by writing a stored procedure. BTW, why *do*
> >you want this function?
> >
> >Shawn Green
> >Database Administrator
> >Unimin Corporation - Spruce Pine
> >
> >Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005 01:30:35
> >PM:
> >
> >
> >
> >>I'm trying to replicate this PHP behavior in a MySQL stored procedure.
> >> The purpose is to pad every character of the string with a pad
> >>character.  For example, if the pad character is 'x' and the string is
> >>'STRING', the result is 'xSxTxRxIxNxGx'.
> >>
> >>Here is the PHP code if it helps.  I'd like to use a regular
> >>expression to replace, but I guess I could loop through the string
> >>char by char and build a new one, it's just less elegant.  Thanks in
> >>advance.
> >>
> >>PHP:
> >>-
> >>$regPattern =  implode('x', preg_split('//', "STRING", -1,
> >>PREG_SPLIT_NO_EMPTY));
> >>
> >>--
> >>MySQL General Mailing List
> >>For list archives: http://lists.mysql.com/mysql
> >>To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
> >>
> >>
> >>
> >
> >
> >
> 
>

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: MySQL preg_split functionality?

2005-05-27 Thread Eric Bergen

I'm working on a set of UDFs for preg functions.


[EMAIL PROTECTED] wrote:

I have a hard time figuring out when you would use such a function. I do 
not believe you will be able to duplicate this behavior without 
constructing your own UDF or by writing a stored procedure. BTW, why *do* 
you want this function?


Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005 01:30:35 
PM:


 


I'm trying to replicate this PHP behavior in a MySQL stored procedure.
The purpose is to pad every character of the string with a pad
character.  For example, if the pad character is 'x' and the string is
'STRING', the result is 'xSxTxRxIxNxGx'.

Here is the PHP code if it helps.  I'd like to use a regular
expression to replace, but I guess I could loop through the string
char by char and build a new one, it's just less elegant.  Thanks in
advance.

PHP:
-
$regPattern =  implode('x', preg_split('//', "STRING", -1,
PREG_SPLIT_NO_EMPTY));

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

   



 




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: MySQL preg_split functionality?

2005-05-27 Thread SGreen
I have a hard time figuring out when you would use such a function. I do 
not believe you will be able to duplicate this behavior without 
constructing your own UDF or by writing a stored procedure. BTW, why *do* 
you want this function?

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Scott Klarenbach <[EMAIL PROTECTED]> wrote on 05/27/2005 01:30:35 
PM:

> I'm trying to replicate this PHP behavior in a MySQL stored procedure.
>  The purpose is to pad every character of the string with a pad
> character.  For example, if the pad character is 'x' and the string is
> 'STRING', the result is 'xSxTxRxIxNxGx'.
> 
> Here is the PHP code if it helps.  I'd like to use a regular
> expression to replace, but I guess I could loop through the string
> char by char and build a new one, it's just less elegant.  Thanks in
> advance.
> 
> PHP:
> -
> $regPattern =  implode('x', preg_split('//', "STRING", -1,
> PREG_SPLIT_NO_EMPTY));
> 
> -- 
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
> 


Re: Mysql Preg_split functionality

2005-05-27 Thread Scott Klarenbach
Here's what I came up with in case anyone else needs a quick fix.  A
regular expression replace would've been nicer, but, you do what you
gotta do...

CREATE PROCEDURE `test`(`par` varchar(60))
BEGIN
DECLARE nChars INT DEFAULT CHAR_LENGTH(par);
DECLARE nCounter INT DEFAULT 1;
DECLARE sPattern VARCHAR(300) DEFAULT '';
DECLARE sRegEx VARCHAR(60) DEFAULT '[^a-zA-Z0-9]*';

-- pad the regex pattern on each side of every character in the search 
string
WHILE nCounter <= nChars DO
SET sPattern = CONCAT(sPattern, sRegEx, SUBSTR(par, nCounter, 
1));
SET nCounter = nCounter + 1;
END WHILE;
-- add the pattern to the end of the string too
SET sPattern = CONCAT(sPattern, sRegEx);

END|

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



MySQL preg_split functionality?

2005-05-27 Thread Scott Klarenbach
I'm trying to replicate this PHP behavior in a MySQL stored procedure.
 The purpose is to pad every character of the string with a pad
character.  For example, if the pad character is 'x' and the string is
'STRING', the result is 'xSxTxRxIxNxGx'.

Here is the PHP code if it helps.  I'd like to use a regular
expression to replace, but I guess I could loop through the string
char by char and build a new one, it's just less elegant.  Thanks in
advance.

PHP:
-
$regPattern =  implode('x', preg_split('//', "STRING", -1,
PREG_SPLIT_NO_EMPTY));

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]