Re: [hackers] [PATCH][sbase] sort: Don't do top-level sort when -c is used with -k

2020-01-02 Thread Richard Ipsum
On Wed, Jan 01, 2020 at 09:39:18PM -0800, Michael Forney wrote:
> On 2020-01-01, Richard Ipsum  wrote:
[snip]
> 
> I think the following diff should cover those cases as well:
> 
> diff --git a/sort.c b/sort.c
> index a51997f..fbb1abf 100644
> --- a/sort.c
> +++ b/sort.c
> @@ -385,7 +385,8 @@ main(int argc, char *argv[])
>   /* -b shall only apply to custom key definitions */
>   if (TAILQ_EMPTY() && global_flags)
>   addkeydef("1", global_flags & ~(MOD_STARTB | MOD_ENDB));
> - addkeydef("1", global_flags & MOD_R);
> + if (TAILQ_EMPTY() || (!Cflag && !cflag))
> + addkeydef("1", global_flags & MOD_R);
> 
>   if (!argc) {
>   if (Cflag || cflag) {
> 
> I'm still not convinced of the value of this tie-breaker keydef, so
> another option might be to just only add it if no -k flag is
> specified:
> 
> diff --git a/sort.c b/sort.c
> index a51997f..adf1d6d 100644
> --- a/sort.c
> +++ b/sort.c
> @@ -383,9 +383,8 @@ main(int argc, char *argv[])
>   } ARGEND
> 
>   /* -b shall only apply to custom key definitions */
> - if (TAILQ_EMPTY() && global_flags)
> + if (TAILQ_EMPTY())
>   addkeydef("1", global_flags & ~(MOD_STARTB | MOD_ENDB));
> - addkeydef("1", global_flags & MOD_R);
> 
>   if (!argc) {
>   if (Cflag || cflag) {
> 
> I think I will apply the first diff for now to fix the bug, and
> perhaps propose the second as a patch to the list, since it involves
> an unrelated behavior change (order of equal lines according to
> options passed to sort).

First diff looks good, as does the second, personally I don't feel too
strongly either way about the presence of the tie-breaker, I think other
sorts do something like it, so for this reason it may be worth keeping
it.

Thanks,
Richard



Re: [hackers] [PATCH][sbase] sort: Don't do top-level sort when -c is used with -k

2020-01-01 Thread Michael Forney
On 2020-01-01, Richard Ipsum  wrote:
> ---
>  sort.c | 13 +
>  1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/sort.c b/sort.c
> index fc76738..c586394 100644
> --- a/sort.c
> +++ b/sort.c
> @@ -383,10 +383,15 @@ main(int argc, char *argv[])
>   usage();
>   } ARGEND
>
> - /* -b shall only apply to custom key definitions */
> - if (TAILQ_EMPTY() && global_flags)
> - addkeydef("1", global_flags & ~(MOD_STARTB | MOD_ENDB));
> - addkeydef("1", global_flags & MOD_R);
> + if (TAILQ_EMPTY()) {
> + if (global_flags) {
> + /* -b shall only apply to custom key definitions */
> + addkeydef("1", global_flags & ~(MOD_STARTB | MOD_ENDB));
> + }
> + addkeydef("1", global_flags & MOD_R);
> + } else if (!Cflag && !cflag) {
> + addkeydef("1", global_flags & MOD_R);
> + }
>
>   if (!argc) {
>   if (Cflag || cflag) {

I've been reading up on the various sort(1) options, and trying to
understand the original intent of this code.

My current understanding is that when no -k flag is present, it adds a
keydef for all fields with the global flags, and regardless, it adds a
keydef for all fields with no flags (except -r) so that there is some
tie-breaker for lines that are equal according to the specified flags.
This tie-breaker is fine for printing the lines, but should not be
used when checking if the file is already sorted.

Consider `sort -f -c` with the input

a
A

This should return 0 since -f considers all lowercase characters as
uppercase, but currently it returns 1, so it is not only a problem
with -k.

I think the following diff should cover those cases as well:

diff --git a/sort.c b/sort.c
index a51997f..fbb1abf 100644
--- a/sort.c
+++ b/sort.c
@@ -385,7 +385,8 @@ main(int argc, char *argv[])
/* -b shall only apply to custom key definitions */
if (TAILQ_EMPTY() && global_flags)
addkeydef("1", global_flags & ~(MOD_STARTB | MOD_ENDB));
-   addkeydef("1", global_flags & MOD_R);
+   if (TAILQ_EMPTY() || (!Cflag && !cflag))
+   addkeydef("1", global_flags & MOD_R);

if (!argc) {
if (Cflag || cflag) {

I'm still not convinced of the value of this tie-breaker keydef, so
another option might be to just only add it if no -k flag is
specified:

diff --git a/sort.c b/sort.c
index a51997f..adf1d6d 100644
--- a/sort.c
+++ b/sort.c
@@ -383,9 +383,8 @@ main(int argc, char *argv[])
} ARGEND

/* -b shall only apply to custom key definitions */
-   if (TAILQ_EMPTY() && global_flags)
+   if (TAILQ_EMPTY())
addkeydef("1", global_flags & ~(MOD_STARTB | MOD_ENDB));
-   addkeydef("1", global_flags & MOD_R);

if (!argc) {
if (Cflag || cflag) {

I think I will apply the first diff for now to fix the bug, and
perhaps propose the second as a patch to the list, since it involves
an unrelated behavior change (order of equal lines according to
options passed to sort).



[hackers] [PATCH][sbase] sort: Don't do top-level sort when -c is used with -k

2020-01-01 Thread Richard Ipsum
---
 sort.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/sort.c b/sort.c
index fc76738..c586394 100644
--- a/sort.c
+++ b/sort.c
@@ -383,10 +383,15 @@ main(int argc, char *argv[])
usage();
} ARGEND
 
-   /* -b shall only apply to custom key definitions */
-   if (TAILQ_EMPTY() && global_flags)
-   addkeydef("1", global_flags & ~(MOD_STARTB | MOD_ENDB));
-   addkeydef("1", global_flags & MOD_R);
+   if (TAILQ_EMPTY()) {
+   if (global_flags) {
+   /* -b shall only apply to custom key definitions */
+   addkeydef("1", global_flags & ~(MOD_STARTB | MOD_ENDB));
+   }
+   addkeydef("1", global_flags & MOD_R);
+   } else if (!Cflag && !cflag) {
+   addkeydef("1", global_flags & MOD_R);
+   }
 
if (!argc) {
if (Cflag || cflag) {
-- 
2.24.1