RE: natural sort via substrings
-Original Message- From: Emily Heureux [mailto:[EMAIL PROTECTED] Sent: Thursday, June 12, 2008 8:25 PM To: mysql@lists.mysql.com Subject: natural sort via substrings Hi, I am attempting to do a natural sort from within mysql, if possible. So, for example, jane2 would come before jane10, and normal strings would still sort as expected. I found some solutions, like using length for the numerical part, but that only works if the strings are the same length. Ideally, I would like to use substring_index, but stick a regexp in as the delimiter. So far, it seems you cannot do this. Does anyone know how to put a regexp as the delimiter in substring_index? For example, I want to do something like this: .order by substring_index(name, 'regexp [0-9]+', 1); Is this possible? Forgive me if I am incorrect, but wouldn't jane2 already be listed before jane10 if you just ORDER BY fieldname ASC? I suppose jane2 and jane20 would wind up next to each other if this were the case, but can you not zero-fill your values (i.e., jane02) to prevent this from happening? Sorry if my suggestion falls short of the mark, but the conditions for your test case were vague at best. :) Can you not zero-fill? Are you sorting by the entire field's value, or just a portion of it? Wouldn't substring_index() sort all jane## entries arbitrarily, since your (theoretical) example returns everything to the left of the first match (but not including the match)? Todd Boyd Web Programmer -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: natural sort via substrings
-Original Message- From: Boyd, Todd M. [mailto:[EMAIL PROTECTED] Sent: Friday, June 13, 2008 10:33 AM To: Emily Heureux; mysql@lists.mysql.com Subject: RE: natural sort via substrings ---8--- snip Forgive me if I am incorrect, but wouldn't jane2 already be listed before jane10 if you just ORDER BY fieldname ASC? I suppose jane2 Man, what a morning. After reading what I wrote, it's obvious to me that this should not happen (jane2 coming before jane10, I mean.. 1 2). Maybe sort by length first, then sub-sort by alpha? As far as regex goes, I know of no way to use it inside the MySQL function substring_index(). Todd Boyd Web Programmer -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: natural sort via substrings
Hi, I am sorry for being so vague. The values are not as simple as jane20, jane10, or jane2. There are names like, jane-2, alex 3, alex4, and just 'jane', etc. ORDER BY sorts by either numerical OR string, not both, as far as I can tell. I'll have to think about zero-fill, but I doubt that will work given the lack of standards for the names I am getting (protein names). I need a sort that works like the way a human would sort. As for substring_index(), I was using that because the names I am dealing with often have a distinguishing number at the end, so I would like to just grab all but the number(s), and then grab just the number(s) (So, actually, I am using substring_index() twice). If I can separate out the numbers at the end from the rest of the string with a regexp delimiter, the problem is solved. I just don't know if that can be done from within ORDER BY. Thanks, Emily -Original Message- From: Boyd, Todd M. [mailto:[EMAIL PROTECTED] Sent: Friday, June 13, 2008 8:33 AM To: Emily Heureux; mysql@lists.mysql.com Subject: RE: natural sort via substrings -Original Message- From: Emily Heureux [mailto:[EMAIL PROTECTED] Sent: Thursday, June 12, 2008 8:25 PM To: mysql@lists.mysql.com Subject: natural sort via substrings Hi, I am attempting to do a natural sort from within mysql, if possible. So, for example, jane2 would come before jane10, and normal strings would still sort as expected. I found some solutions, like using length for the numerical part, but that only works if the strings are the same length. Ideally, I would like to use substring_index, but stick a regexp in as the delimiter. So far, it seems you cannot do this. Does anyone know how to put a regexp as the delimiter in substring_index? For example, I want to do something like this: .order by substring_index(name, 'regexp [0-9]+', 1); Is this possible? Forgive me if I am incorrect, but wouldn't jane2 already be listed before jane10 if you just ORDER BY fieldname ASC? I suppose jane2 and jane20 would wind up next to each other if this were the case, but can you not zero-fill your values (i.e., jane02) to prevent this from happening? Sorry if my suggestion falls short of the mark, but the conditions for your test case were vague at best. :) Can you not zero-fill? Are you sorting by the entire field's value, or just a portion of it? Wouldn't substring_index() sort all jane## entries arbitrarily, since your (theoretical) example returns everything to the left of the first match (but not including the match)? Todd Boyd Web Programmer -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: natural sort via substrings
-Original Message- From: Emily Heureux [mailto:[EMAIL PROTECTED] Sent: Friday, June 13, 2008 11:12 AM To: Boyd, Todd M.; mysql@lists.mysql.com Subject: RE: natural sort via substrings Hi, I am sorry for being so vague. The values are not as simple as jane20, jane10, or jane2. There are names like, jane-2, alex 3, alex4, and just 'jane', etc. ORDER BY sorts by either numerical OR string, not both, as far as I can tell. I'll have to think about zero-fill, but I doubt that will work given the lack of standards for the names I am getting (protein names). I need a sort that works like the way a human would sort. As for substring_index(), I was using that because the names I am dealing with often have a distinguishing number at the end, so I would like to just grab all but the number(s), and then grab just the number(s) (So, actually, I am using substring_index() twice). If I can separate out the numbers at the end from the rest of the string with a regexp delimiter, the problem is solved. I just don't know if that can be done from within ORDER BY. ---8--- snip Emily, From an exhaustive search of the web (including MySQL's page and others), it appears that Regular Expression support in MySQL is limited at best. All you can do with it thus far is determine whether or not something matched the given expression--1 or 0 are the only results possible, from what I can gather. You may be forced to either nest a bunch of queries using substring_index() and other server-side string manipulation functions, or sort the data after queried-extraction. :( By all means, though, do not take my information as gospel. There may very well be a way to pull off what you're trying to do... but ORDER BY REGEXP is not it. Todd Boyd Web Programmer -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: natural sort via substrings
-Original Message- From: Emily Heureux [mailto:[EMAIL PROTECTED] Sent: Thursday, June 12, 2008 8:25 PM To: mysql@lists.mysql.com Subject: natural sort via substrings Hi, I am attempting to do a natural sort from within mysql, if possible. So, for example, jane2 would come before jane10, and normal strings would still sort as expected. I'd try adding a column (e.g. sort_as) to your table, and then populate that with strings suitable for sorting (e.g. jane2 would become jane0002, jane10 = jane0010, and so on). One nice thing about this is that you can use an external language like Java, Perl, or Python to calculate your sort string. SQL is a painful choice for that task in my opinion. HTH, HAND. J -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: natural sort via substrings
From: Jim Lyons [mailto:[EMAIL PROTECTED] Sent: Friday, June 13, 2008 12:10 PM To: Boyd, Todd M. Subject: Re: natural sort via substrings What I would do is form 2 additional fields from the first, an alphanumeric field and a numeric field so: jane-2 jane- 2 alex 3 alex 3 (maybe put a blank after the 'x' but be sure to handle it properly, trailing spaces are tricky) alex4 alex 4 jane jane (maybe null, or 0 for the numeric field, depending on the application) Then sort on the 2 fields. Depending on your application, you might want to keep the entire name field as well as the 2 derivative fields, just accept the de-normalization. You can use the entire field for display purposes. If your table is of any size whatsoever this would be vastly preferable to doing all sorts of substring-ing in your where clauses. You won't have a prayer of having the optimizer use an index. Jim On Fri, Jun 13, 2008 at 11:58 AM, Boyd, Todd M. [EMAIL PROTECTED] wrote: -Original Message- From: Emily Heureux [mailto:[EMAIL PROTECTED] Sent: Friday, June 13, 2008 11:12 AM To: Boyd, Todd M.; mysql@lists.mysql.com Subject: RE: natural sort via substrings Hi, I am sorry for being so vague. The values are not as simple as jane20, jane10, or jane2. There are names like, jane-2, alex 3, alex4, and just 'jane', etc. ORDER BY sorts by either numerical OR string, not both, as far as I can tell. I'll have to think about zero-fill, but I doubt that will work given the lack of standards for the names I am getting (protein names). I need a sort that works like the way a human would sort. As for substring_index(), I was using that because the names I am dealing with often have a distinguishing number at the end, so I would like to just grab all but the number(s), and then grab just the number(s) (So, actually, I am using substring_index() twice). If I can separate out the numbers at the end from the rest of the string with a regexp delimiter, the problem is solved. I just don't know if that can be done from within ORDER BY. ---8--- snip Emily, From an exhaustive search of the web (including MySQL's page and others), it appears that Regular Expression support in MySQL is limited at best. All you can do with it thus far is determine whether or not something matched the given expression--1 or 0 are the only results possible, from what I can gather. You may be forced to either nest a bunch of queries using substring_index() and other server-side string manipulation functions, or sort the data after queried-extraction. :( By all means, though, do not take my information as gospel. There may very well be a way to pull off what you're trying to do... but ORDER BY REGEXP is not it. Jim, Thank you for formulating my idea into a more easily-understandable format. :) It's been a long week. What I meant to say when I mentioned sort the data after queried-extraction was that you would need to split the data extracted with a query into its sort-able parts as you described. Very well put on your part, though. Emily, You might try and find a conditional string function that will find the first digit character in a string (or a nifty manipulation of a different string function that accomplishes the same objective). I understand that's sort of along the line of what you were trying to do in the first place, but perhaps this new funneling of the problem will give you some different insight into a possible solution. I'll scour the web and documentation with the free time I may or may not get today and see what I can come up with, as well. Don't give up hope! :) This sort of data transformation is not an uncommon occurrence, and someone somewhere must have tried to tackle a similar problem--with positive results. *crosses fingers* If not, maybe someone else's idea for a solution can plant the seed. Todd Boyd Web Programmer -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: natural sort via substrings
Ok, thanks for all the information. I have tried solving the problem via perl, but I am using DBIx::Class and Catalyst, which are both relatively new to me. The way DBIC deals with grabbing data from the database combined with my limited skills had me looking for a mysql solution. You're saving me time telling me regexp in the order by clause won't work. I won't waste any more time down that road. Thank you again, -Emily -Original Message- From: Boyd, Todd M. [mailto:[EMAIL PROTECTED] Sent: Friday, June 13, 2008 11:29 AM To: Jim Lyons; mysql@lists.mysql.com; Emily Heureux Subject: RE: natural sort via substrings From: Jim Lyons [mailto:[EMAIL PROTECTED] Sent: Friday, June 13, 2008 12:10 PM To: Boyd, Todd M. Subject: Re: natural sort via substrings What I would do is form 2 additional fields from the first, an alphanumeric field and a numeric field so: jane-2 jane- 2 alex 3 alex 3 (maybe put a blank after the 'x' but be sure to handle it properly, trailing spaces are tricky) alex4 alex 4 jane jane (maybe null, or 0 for the numeric field, depending on the application) Then sort on the 2 fields. Depending on your application, you might want to keep the entire name field as well as the 2 derivative fields, just accept the de-normalization. You can use the entire field for display purposes. If your table is of any size whatsoever this would be vastly preferable to doing all sorts of substring-ing in your where clauses. You won't have a prayer of having the optimizer use an index. Jim On Fri, Jun 13, 2008 at 11:58 AM, Boyd, Todd M. [EMAIL PROTECTED] wrote: -Original Message- From: Emily Heureux [mailto:[EMAIL PROTECTED] Sent: Friday, June 13, 2008 11:12 AM To: Boyd, Todd M.; mysql@lists.mysql.com Subject: RE: natural sort via substrings Hi, I am sorry for being so vague. The values are not as simple as jane20, jane10, or jane2. There are names like, jane-2, alex 3, alex4, and just 'jane', etc. ORDER BY sorts by either numerical OR string, not both, as far as I can tell. I'll have to think about zero-fill, but I doubt that will work given the lack of standards for the names I am getting (protein names). I need a sort that works like the way a human would sort. As for substring_index(), I was using that because the names I am dealing with often have a distinguishing number at the end, so I would like to just grab all but the number(s), and then grab just the number(s) (So, actually, I am using substring_index() twice). If I can separate out the numbers at the end from the rest of the string with a regexp delimiter, the problem is solved. I just don't know if that can be done from within ORDER BY. ---8--- snip Emily, From an exhaustive search of the web (including MySQL's page and others), it appears that Regular Expression support in MySQL is limited at best. All you can do with it thus far is determine whether or not something matched the given expression--1 or 0 are the only results possible, from what I can gather. You may be forced to either nest a bunch of queries using substring_index() and other server-side string manipulation functions, or sort the data after queried-extraction. :( By all means, though, do not take my information as gospel. There may very well be a way to pull off what you're trying to do... but ORDER BY REGEXP is not it. Jim, Thank you for formulating my idea into a more easily-understandable format. :) It's been a long week. What I meant to say when I mentioned sort the data after queried-extraction was that you would need to split the data extracted with a query into its sort-able parts as you described. Very well put on your part, though. Emily, You might try and find a conditional string function that will find the first digit character in a string (or a nifty manipulation of a different string function that accomplishes the same objective). I understand that's sort of along the line of what you were trying to do in the first place, but perhaps this new funneling of the problem will give you some different insight into a possible solution. I'll scour the web and documentation with the free time I may or may not get today and see what I can come up with, as well. Don't give up hope! :) This sort of data transformation is not an uncommon occurrence, and someone somewhere must have tried to tackle a similar problem--with positive results. *crosses fingers* If not, maybe someone else's idea for a solution can plant the seed. Todd Boyd Web Programmer -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]