[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2024-03-21 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Maude  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |maude.boudr...@collecto.ca

-- 
You are receiving this mail because:
You are watching all bug changes.
___
Koha-bugs mailing list
Koha-bugs@lists.koha-community.org
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/


[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-09-15 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #45 from Victor Grousset/tuxayo  ---
https://sourceforge.net/p/icu/mailman/icu-support/thread/d1a3c944-02a7-4cc0-87ad-4ad30b773967%40tuxayo.net/#msg37895241

> > I think you might be able to get that by reordering spaces above
> > punctuation, and setting the first non ignorable to be the first space.
> >
> 
> No, I am pretty sure that alternate=shifted compares the primary weight
> with the maxVariable setting before script reordering.
> 
> I just tried this in the ICU Collation Demo with
> 
>    - rule "[reorder punct space]"
>    - alternate=shifted
>    - max variable=punct
> 
> and both spaces and punctuation get shifted/ignored.
> 
> So I don't think we have a way to ignore/shift anything other than
> "anything up to the max variable".
> 
> You might be best off pre-processing the strings, removing punctuation
> characters before sending the string into collation.

This is quite over my head, but it seems to confirm that there is no way to get
what we want with ICU configuration alone.


If these assumptions are correct:
- our need is "punctuation ignored, but whitespace not ignored"
- "strength: quaternary" is still better than "alternate: shifted" and better
than the current sorting
  - it might need more testing, since it managed to masquerade as complying
with the test plan ^^"

then the immediate way forward is to go ahead with the "strength: quaternary"
patch.
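
A minimal sketch of the pre-processing idea quoted above, written in Python
rather than Koha's Perl and without ICU. Punctuation is stripped before
comparison while whitespace is kept. The name strip_punct_key and the use of
Unicode general categories to decide what counts as punctuation are assumptions
for illustration only; a real fix would still pass the cleaned string on to the
collator.

```python
import unicodedata

def strip_punct_key(heading: str) -> str:
    """Sort key: drop punctuation and symbols, keep whitespace, fold case."""
    kept = [
        ch for ch in heading
        # Categories starting with P (punctuation) or S (symbol) are dropped.
        if unicodedata.category(ch)[0] not in ("P", "S")
    ]
    # Collapse any runs of whitespace left behind by the removed characters.
    return " ".join("".join(kept).split()).casefold()

headings = [
    "Hand (Fictitious character)",
    "Hand blows",
    "Hand book for Prospect Park",
    "Hand in glove",
    "Hand-ball",
    "Handbok for sangere",
    "Handbook for adventure",
    "Hande im Pflug",
    "Hands in the past",
    "Handu",
]

for heading in sorted(headings, key=strip_punct_key):
    print(heading)
```

With this key the ten "Hand" headings come out in the order listed in the test
plan, because "Hand (Fictitious character)" is compared as "hand fictitious
character" and "Hand-ball" as "handball".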



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-09-12 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #44 from Victor Grousset/tuxayo  ---
https://sourceforge.net/p/icu/mailman/icu-support/thread/d1a3c944-02a7-4cc0-87ad-4ad30b773967%40tuxayo.net/#msg37894734

Someone said:

> I think you might be able to get that by reordering spaces above punctuation, 
> and setting the first non ignorable to be the first space.

Does anyone know if we can do such a thing? ^^"



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-09-11 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #43 from Victor Grousset/tuxayo  ---
> It seems what we want is "punctuation ignored, but whitespace not ignored" 
> but I don't seem to find that option:

So the option doesn't exist in ICU itself; it's not that the analysis-icu
plugin fails to expose it?

Asked in the ICU support mailing list, let's see if we get something :)



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-09-11 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Victor Grousset/tuxayo  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Signed Off                  |Failed QA

--- Comment #42 from Victor Grousset/tuxayo  ---
> Retesting, this is what I get, and rereading, this is what is expected with 
> quaternary - it considers differences at the 4th level, meaning accents and 
> punctuation are both considered

Oops, so the patch doesn't match the specs/test plan.

So "strength: quaternary" is still better than "alternate: shifted", right? And
better than the current sorting.

Judging from all the previous comments, I'm not sure there is another low-cost
alternative.



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-09-11 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #41 from Nick Clemens  ---
(In reply to Victor Grousset/tuxayo from comment #37)
> > 1 - Create authorities with main headings like below and confirm they sort 
> > in the order shown
> > Hand blows
> > Hand book for Prospect Park
> > Hand (Fictitious character)
> > Hand in glove
> > Hand-ball
> > Handbok for sangere
> > Handbook for adventure
> > Hande im Pflug
> > Hands in the past
> > Handu
> 
> nope T_T
> 
> i get this:
>  Hand (Fictitious character) 
>  Hand blows 
>  Hand book for Prospect Park 
>  Hand in glove 
>  Hand-ball 
>  Handbok for sangere 
>  Handbook for adventure 
>  Hande im Pflug 
>  Hands in the past 
>  Handu 
> 
> With ES 7 and opensearch 1.x
> And I double checked not using zebra instead of ES.
> My koha (ktd) was started with the patch already applied.
> 
> Any idea of what could go wrong? Any implicit step before the 1st in the
> test plan ?

Retesting, this is what I get, and on rereading, this is what is expected with
quaternary: it considers differences at the 4th level, meaning accents and
punctuation are both considered.

So "Hand (Fictitious character)" comes first because the space and then the '('
come before the letters.

It makes other things sort too:
hand book for Prospect Park
Hand book for Prospect Park
Hand boôk for Prospect Park
Hand böok for Prospect Park

It seems what we want is "punctuation ignored, but whitespace not ignored" but
I don't seem to find that option:
https://unicode-org.github.io/icu/userguide/collation/architecture.html#strength-level
https://unicode-org.github.io/icu/userguide/collation/customization/ignorepunct.html
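
To make the ordering described above concrete, here is a rough Python model in
which whitespace and punctuation are compared rather than ignored, and both
sort before letters. It is not ICU: it only imitates a single comparison level,
and the relative order of the punctuation marks in PUNCT_ORDER is an assumption
based on the orders reported in this bug (hyphen before comma).

```python
import unicodedata

# Assumed relative order of a few punctuation marks (hypothetical, not from ICU data).
PUNCT_ORDER = "-,.()"

def nonignorable_key(heading: str) -> list:
    """Sort key where whitespace < punctuation < letters and digits."""
    key = []
    for ch in heading.casefold():
        if ch.isspace():
            key.append((0, 0))
        elif unicodedata.category(ch)[0] in ("P", "S"):
            pos = PUNCT_ORDER.find(ch)
            # Marks missing from PUNCT_ORDER sort after the listed ones.
            key.append((1, pos if pos >= 0 else len(PUNCT_ORDER) + ord(ch)))
        else:
            key.append((2, ord(ch)))
    return key

headings = [
    "Hand blows",
    "Hand book for Prospect Park",
    "Hand (Fictitious character)",
    "Hand in glove",
    "Hand-ball",
    "Handbok for sangere",
    "Handbook for adventure",
    "Hande im Pflug",
    "Hands in the past",
    "Handu",
]

for heading in sorted(headings, key=nonignorable_key):
    print(heading)
```

Under these assumptions "Hand (Fictitious character)" is printed first and
"Hand-ball" lands after "Hand in glove", which is the order reported in
comment #37.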



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-09-09 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #40 from Victor Grousset/tuxayo  ---
> confirm your koha-conf is pointing to the updated file from this patch:

It is commented out, as expected. So it would then use the default (updated by
this patch):
$config_file ||= C4::Context->config('intranetdir') .
'/admin/searchengine/elasticsearch/index_config.yaml';



> confirm you are using ktd

yes, even more «My koha (ktd) was started with the patch already applied.»

So in any case, the field_config.yaml is the right one and there is no need to
reindex or reset the mappings (thanks for the -r, I'll keep that in mind).

Retried and still got a wrong order, hmmm 



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-09-07 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #39 from Nick Clemens  ---
(In reply to Victor Grousset/tuxayo from comment #38)
> Another question: how does this change work for existing instances?
> I had a koha on main/master.
> - applied the patch
> - restarted the ES container (waited for it to finish)
> - restarted services
> - reindexed (because the container didn't have persistence)
> misc/search_tools/rebuild_elasticsearch.pl -v -d
> 
> 
> And the search still had the same order :o
> 
> 
> ---
> 
> 
> > 3 - Test biblio search sorting as well (title, author, etc)
> 
> Is it about just making sure the catalogue search and the an:XX syntax give
> sane results or something more specific?

Try -r instead of -d and confirm you are using ktd and/or confirm your
koha-conf is pointing to the updated file from this patch:
362  
363  



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-08-31 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #38 from Victor Grousset/tuxayo  ---
Another question: how does this change work for existing instances?
I had a koha on main/master.
- applied the patch
- restarted the ES container (waited for it to finish)
- restarted services
- reindexed (because the container didn't have persistence)
misc/search_tools/rebuild_elasticsearch.pl -v -d


And the search still had the same order :o


---


> 3 - Test biblio search sorting as well (title, author, etc)

Is it about just making sure the catalogue search and the an:XX syntax give
sane results or something more specific?



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-08-31 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #37 from Victor Grousset/tuxayo  ---
> 1 - Create authorities with main headings like below and confirm they sort in 
> the order shown
> Hand blows
> Hand book for Prospect Park
> Hand (Fictitious character)
> Hand in glove
> Hand-ball
> Handbok for sangere
> Handbook for adventure
> Hande im Pflug
> Hands in the past
> Handu

nope T_T

i get this:
 Hand (Fictitious character) 
 Hand blows 
 Hand book for Prospect Park 
 Hand in glove 
 Hand-ball 
 Handbok for sangere 
 Handbook for adventure 
 Hande im Pflug 
 Hands in the past 
 Handu 

With ES 7 and opensearch 1.x
And I double checked not using zebra instead of ES.
My koha (ktd) was started with the patch already applied.

Any idea of what could go wrong? Any implicit step before the 1st in the test
plan ?



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-08-04 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #36 from Victor Grousset/tuxayo  ---
> It may not comply exactly with Library of Congress Filing Rules, but Koha is 
> for the whole world and not everyone wants to comply with Library of Congress 
> Filing Rules!

That wouldn't prevent someone from submitting a change in another ticket to
implement the Library of Congress Filing Rules. From a functional POV, if there
is no other convention that this would conflict with, then there wouldn't be an
issue with changing that last sorting rule.


> I think it is for a developer to say if it is easier and quicker to make this 
> patch comply with Library of Congress Filing Rules (i.e., treat hyphens, 
> commas, etc. as spaces)

About that, I'm not sure there is a config in ICU or Elastic that changes only
what we need without messing up all the rest. For reference if someone wants to
dig into that:
https://www.elastic.co/guide/en/elasticsearch/plugins/7.17/analysis-icu-collation-keyword-field.html
https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/Collator.html

> or just get this fix out there to make it better now 

We might be in this situation. It seems that getting the last detail right
(which isn't strictly the initial report but a thread that led us to a clew of
yarn ^^) suddenly and greatly increases the time needed to reach a working
solution.

Other than that, the current proposal has no known potential issue, right?



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-08-04 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #35 from Heather  ---
Hi!

I think it would be an OK fix if it sorts " "(space) after hyphens and other
punctuation, because that is clear to the user.  It may not comply exactly with
Library of Congress Filing Rules, but Koha is for the whole world and not
everyone wants to comply with Library of Congress Filing Rules!  A fix is
NEEDED to make it better, and this would be better.

I think it is for a developer to say if it is easier and quicker to make this
patch comply with Library of Congress Filing Rules (i.e., treat hyphens, commas,
etc. as spaces) NOW and delay the fix, or just get this fix out there to make
it better now and create another bug to let the library decide if they want
spaces sorted after hyphens & other punctuation, or to have hyphens & commas &
such treated as spaces in the sorting (via a new syspref?).

--h2



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-08-03 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #34 from Victor Grousset/tuxayo  ---
splat => split (past participle)



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-08-03 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #33 from Victor Grousset/tuxayo  ---
(In reply to Nick Clemens from comment #18)
> Forgive the silly example, but "alternate: shifted" ignores punctuation and
> whitespace.
> 
> Do we want white space considered? i.e. is it correct for 'Santabna' to sort
> before 'Santa clarita'?

Thanks for catching this! :o
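
A small Python sketch of the effect Nick describes, under the simplifying
assumption that "ignored" means "removed before comparison". Real ICU with
"alternate: shifted" is more subtle, since ignored characters can still act as
a tie-breaker at the fourth level; the function names here are made up for the
example.

```python
import unicodedata

def _is_punct(ch: str) -> bool:
    # Treat Unicode punctuation (P*) and symbols (S*) as punctuation.
    return unicodedata.category(ch)[0] in ("P", "S")

def ignore_punct_and_space_key(heading: str) -> str:
    """Rough stand-in for 'alternate: shifted': drop punctuation AND whitespace."""
    return "".join(
        ch for ch in heading.casefold()
        if not ch.isspace() and not _is_punct(ch)
    )

def ignore_punct_only_key(heading: str) -> str:
    """Drop punctuation but keep whitespace significant."""
    return "".join(ch for ch in heading.casefold() if not _is_punct(ch))

names = ["Santa clarita", "Santabna"]

print(sorted(names, key=ignore_punct_and_space_key))
print(sorted(names, key=ignore_punct_only_key))
```

With whitespace ignored, the comparison is "santaclarita" against "santabna",
so 'Santabna' sorts first. With whitespace kept, the space after "Santa" sorts
before any letter and 'Santa clarita' stays with the other "Santa ..." entries.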

---

(In reply to Heather from comment #32)
> I don't think so.  In
> https://babel.hathitrust.org/cgi/pt?id=mdp.39015022080140&view=1up&seq=63 
> Rule 12 states, "Words connected by a hyphen [...]

So, if I understand correctly, the current proposal should sort " " (space)
after "-", "," and other symbols?

How much of a problem is it? The first problematic cases on this ticket caused
the results to be grouped by punctuation. That was bad because it randomly
mixed entries that were semantically linked and should have been together; the
letters carry the real meaning here, not a hyphen.

comment 18 also shows stuff mixed that splat the group of "santa " that should
have stayed all together. 

So about sorting " " after "-" and ",": does it lead to examples of entries
ending up in the wrong place that are as bad as the two cases above? Or is the
rule more about having a standardized, unambiguous order for the sake of
consistency and predictability, even if nothing would be semantically messed up
without that rule?



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-06-27 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #32 from Heather  ---
Hi, Aleisha--

I don't think so.  In
https://babel.hathitrust.org/cgi/pt?id=mdp.39015022080140&view=1up&seq=63  Rule
12 states, "Words connected by a hyphen are always treated as separate words.
This rule applies even when the first part of a hyphenated word is a prefix
that sometimes appears as an integral part of the word. This rule also applies
to compound surnames."  So, yes, "nothing before something," but a hyphen is
treated as "nothing," i.e., a space.  So I agree with Janusz that
"Haendel-Adler, John" should be first in the example.

Most institutions agree, too, that the order should then be:
Haendel-Adler, John
Haendel, Georg
Haendel Gregory

The Library of Congress Filing Rules go deeply into intricacies...but Koha is a
global community.  In this case, though, I think LC Filing Rules are agreeing
with most institutions worldwide that the order should be:
Haendel-Adler, John
Haendel, Georg
Haendel Gregory

If we go more deeply into the LC Filing Rules, we could come up with examples
that are different based on whether the heading is a surname, a topic, a place,
etc., and which elements of the field we're filing on, but the Rules were
written at a time when the vast majority of libraries had humans filing cards
in different card catalogs:  an author/title catalog, and a different subject
catalog.  We are now in an era when keyword searching returns results from
multiple indexes, and this makes the most sense and agrees with the basic
tenets of common filing rules:
Haendel-Adler, John
Haendel, Georg
Haendel Gregory

Aleisha--shall I test the patch again?

Nick:  I think we're really close!

Cheerio!!!
h2
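
As an illustration of Rule 12 as quoted above (a hyphen is treated like a
space), here is a hedged Python sketch that contrasts two possible sort keys.
Both function names are invented for the example, and treating every
punctuation mark, not only the hyphen, as a space is an assumption that goes
slightly beyond the rule as quoted.

```python
import unicodedata

def _is_punct(ch: str) -> bool:
    # Treat Unicode punctuation (P*) and symbols (S*) as punctuation.
    return unicodedata.category(ch)[0] in ("P", "S")

def punct_as_space_key(heading: str) -> str:
    """Replace each punctuation mark with a space, then collapse whitespace."""
    spaced = "".join(" " if _is_punct(ch) else ch for ch in heading)
    return " ".join(spaced.split()).casefold()

def punct_removed_key(heading: str) -> str:
    """Delete punctuation outright, which joins hyphenated words together."""
    kept = "".join(ch for ch in heading if not _is_punct(ch))
    return " ".join(kept.split()).casefold()

names = ["Haendel Gregory", "Haendel-Adler, John", "Haendel, Georg"]

print(sorted(names, key=punct_as_space_key))
print(sorted(names, key=punct_removed_key))
```

The first key compares "haendel adler john", "haendel georg" and "haendel
gregory", which gives the order listed above. The second key turns the
compound surname into "haendeladler" and pushes it to the end, so simply
deleting punctuation is not equivalent to treating it as a space.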



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-06-27 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #31 from Nick Clemens  ---
If the current patches aren't working as needed, we may want to look into
building the authority sort field using:
https://metacpan.org/pod/MARC::Field::Normalize::NACO

Or something similar
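
To sketch what "building the authority sort field" could look like, the Python
example below precomputes a normalized string and stores it next to the
heading. It is only loosely inspired by the idea of NACO-style normalization;
it does not implement the NACO rules or the API of
MARC::Field::Normalize::NACO, and the names build_sort_field and heading_sort
are hypothetical.

```python
import unicodedata

def build_sort_field(heading: str) -> str:
    """Build a simplified, normalized sort string for a heading."""
    # Decompose accented letters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", heading)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace punctuation and symbols with spaces, then collapse whitespace.
    spaced = "".join(
        " " if unicodedata.category(ch)[0] in ("P", "S") else ch
        for ch in no_marks
    )
    return " ".join(spaced.split()).upper()

# A record as it might look at indexing time (structure assumed for illustration).
record = {
    "heading": "Handel, George Frideric, 1685-1759. Giulio Cesare. Piangerò la sorte mia."
}
record["heading_sort"] = build_sort_field(record["heading"])
print(record["heading_sort"])
```

The point of the design is that sorting would then use the precomputed field,
so the search engine's collation settings would no longer need to express the
filing rules themselves.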



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-06-27 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #30 from Janusz Kaczmarek  ---
Hi Aleisha,

Thank you very much for your comment.  I'm not sure in what sense this order is
correct.  If it means that the order conforms to what ICU sort with "strength:
quaternary" should return, then OK.  But I am still not sure if it conforms to
the LC Filing Rules (especially p. 1.3.1) cited by Heather in the initial
report.

Let's take the initial "Saint" example. I am getting (with the patch):

Saint (Fictitious character)
Saint Charles School
Saint Hilaire Institute
Saint-Nazaire (France)
Saint, Eva Marie

Do you get the same order?

At the same time--according to Heather, and also to my intuition--it should be:

Saint Charles School
Saint, Eva Marie
Saint (Fictitious character)
Saint Hilaire Institute
Saint-Nazaire (France)

So, I am not sure if--assuming that I understand it well and correctly applied
the patch--we get the expected result...



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-06-26 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #29 from Aleisha Amohia  ---
Hi Janusz,

This is the correct order:
> 
> Haendel Gregory
> Haendel-Adler, John
> Haendel, Georg
> 

The policy is 'nothing' before 'something' so spaces will be sorted before
punctuation.

Hope that helps!

Aleisha



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-06-26 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #28 from Janusz Kaczmarek  ---
Hi Heather!

Thank you very much for checking it. 

I can confirm--with 'Ha wen' things look good now.  I'm working with ktd--I had
to restart plack to get the order changed.

But still there are doubts.  Could you, please, have a look at these three
names:

Haendel Gregory
Haendel-Adler, John
Haendel, Georg

(mind the presence or absence of the comma and the dash)

I get them in the above order whereas I would wish: 

Haendel-Adler, John
Haendel, Georg
Haendel Gregory

So, I am confused... Am I doing something wrong? Could you, please, check it
again?



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-06-26 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #27 from Heather  ---
Hi, again, Janusz!

I was not able to reproduce your sorting problem.  I created a sandbox, applied
the patch, then created authority records for all the headings in the test
plan, plus a heading for:  Ha wen.

All headings sorted correctly:
Ha wen
Hagahara, Yoshiaki, 1939-
Hailstone, H.
Haim, Emmanuelle
Hall, Joseph N., 1966-
Halliwell, Stephen.
Hamburger, David
Hamel, Debra.
Hamilton-James, Charlie.
Hamilton-James, Charlie.
Hamilton-Jones, Charlie Homes and haunts.
Hamilton-Jones, Charlie Homes and haunts.
Hamilton, Karen A.
Hanʼguk Kiwŏn.
Hand-ball
Hand blows
Hand book for Prospect Park
Hand (Fictitious character)
Hand in glove
Handbok for sangere
Handbook for adventure
Hande im Pflug
Handel, George Frideric, 1685-1759.
Handel, George Frideric, 1685-1759. Alcina. Tornami a vagheggiar.
Handel, George Frideric, 1685-1759. Amadigi. Ah! spietato! e non ti muove.
Handel, George Frideric, 1685-1759. Apollo e Dafne. Felicissima quest'alma.
Handel, George Frideric, 1685-1759. Ariodante. Mio crudel martoro.
Handel, George Frideric, 1685-1759 Film and video adaptations.
Handel, George Frideric, 1685-1759 Giulio Cesare.
Handel, George Frideric, 1685-1759. Giulio Cesare. Da tempeste il legno
infranto.
Handel, George Frideric, 1685-1759. Giulio Cesare. Piangerò la sorte mia.
Handel, George Frideric, 1685-1759. Rinaldo. Lascia ch'io pianga mia cruda
sorte.
Handel, George Frideric, 1685-1759. Rinaldo. Vo' far guerra.
Handel, George Frideric, 1685-1759. Semele. Endless pleasure, endless love
(Air)
Handel, George Frideric, 1685-1759. Semele. Myself I shall adore.
Handel, George Frideric, 1685-1759. Teseo. Dolce riposo, ed innocente pace.
Handel, George Frideric, 1685-1759. Teseo. O stringerò nel sen.
Hands in the past
Handu
Harmes, Ross.

I even created two authority records for "Ha wen," one as a topical term and
another as a personal name (in case the type of heading was a problem), and
both authority records for "Ha wen" sorted correctly.

Based on this, I think the bug should remain signed off--unless Janusz's
problem can be re-created?



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-06-23 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #26 from Heather  ---
This is not a silly question, Janusz!  'Ha wen' should be at the beginning,
like this:

Ha wen
Hand-ball
Hand blows
Hand book
Hand (Fictitious character)

If it is not sorting this way, it is not the desired effect--it is failing &
shouldn't be signed off.  The patch needs fixing then.

I don't have time to re-test before Monday, but I can try again on Monday
including the example of 'Ha wen.'  Or, if you've already tested with 'Ha wen,'
Janusz, and it's not sorting correctly, then it should fail sign off, I think.

Thank you for noticing this, Janusz!!!  Well done



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-06-23 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #25 from Janusz Kaczmarek  ---
(In reply to Heather from comment #24)
> Signed off!  The sort order is correct:
> 
> Hand-ball
> Hand blows
> Hand book
> Hand (Fictitious character)

Heather, a maybe silly question, since you have analyzed this in detail:
Should a phrase like 'Ha wen' be placed at the beginning or at the end of the
list?

Now it is put at the end, whereas my first intuition would be that I want to
see it at the beginning (this is how I would understand "arranged word by word,
and words are arranged character by character").

So, am I missing something, or do we still not have the desired effect?

I would appreciate your comment.



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-06-23 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #24 from Heather  ---
Signed off!  The sort order is correct:

Hand-ball
Hand blows
Hand book
Hand (Fictitious character)

Etc.!

I made a mistake (above) saying that Hand-ball should file, e.g., after Hand in
glove.  I tested and tested and tested, and the search order with this patch
applied is CORRECT!!  WOO-HOO!!!  ::librarian happy dance!::  (I also verified
that search orders are correct for bibs.)

I'M SO HAPPY!!!  THANK YOU THANK YOU THANK YOU, Most Awesome Community!!!



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-06-22 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Aleisha Amohia  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Needs Signoff               |Signed Off



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-20 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #21 from Janusz Kaczmarek  ---
I have tried the patch.  Maybe I'm doing something wrong but I was unable to
get correct results.  In fact, applying the patch did not change anything in
the order--I'm getting:

Hand (Fictitious character)
Hand blows
Hand book for Prospect Park
Hand in glove
Hand-ball
Handbok for sangere
Handbook for adventure
Hande im Pflug
Hands in the past
Handu

The same problem occurs with a second set, this time of names--I'm getting:

Kowal L.G.
Kowal-Kwiatkowska, Aleksandra
Kowal-Orczykowska, Anna
Kowal, Alina
Kowalak, Alina

whereas I would expect:

Kowal, Alina
Kowal-Kwiatkowska, Aleksandra
Kowal L.G.
Kowal-Orczykowska, Anna
Kowalak, Alina

Unfortunately, the original proposal (alternate: shifted) does not work well
here either (but the order does change, so it is not a case of editing the
wrong file).

Any suggestions?
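
For illustration only: the expected order of the names above is what you get
when every punctuation mark is treated as a space before comparing. The Python
sketch below shows that idea in isolation. It is not what the patch or the ICU
plugin does, and filing_key is a hypothetical helper name.

```python
import re


def filing_key(heading: str) -> str:
    """Hypothetical sort key: punctuation counts as a space, case is ignored."""
    # Replace each run of punctuation and/or whitespace with a single space.
    spaced = re.sub(r"[\W_]+", " ", heading)
    # Drop leading/trailing spaces and fold case before comparing.
    return spaced.strip().casefold()


names = [
    "Kowal L.G.",
    "Kowal-Kwiatkowska, Aleksandra",
    "Kowal-Orczykowska, Anna",
    "Kowal, Alina",
    "Kowalak, Alina",
]

for name in sorted(names, key=filing_key):
    print(name)
```

With this key the list prints as Kowal, Alina / Kowal-Kwiatkowska, Aleksandra /
Kowal L.G. / Kowal-Orczykowska, Anna / Kowalak, Alina.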



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-19 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Nick Clemens  changed:

   What|Removed |Added

 Attachment #151180 is obsolete|0   |1

--- Comment #20 from Nick Clemens  ---
Created attachment 151471
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=151471&action=edit
Bug 26472: Configure Elasticsearch for better alphabetic sorting

This enhancement configures the ICU collation keyword plugin used by
Elasticsearch for sorting to better handle punctuation and whitespace in sort
fields.

Details at:
https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-collation-keyword-field.html
https://unicode-org.github.io/icu/userguide/collation/concepts.html

To test:
1 - Create authorities with main headings like below and confirm they sort in
the order shown
Hand blows
Hand book for Prospect Park
Hand (Fictitious character)
Hand in glove
Hand-ball
Handbok for sangere
Handbook for adventure
Hande im Pflug
Hands in the past
Handu
2 - Also confirm above order is correct and expected
3 - Test biblio search sorting as well (title, author, etc)



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-19 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Nick Clemens  changed:

   What|Removed |Added

 Status|In Discussion   |Needs Signoff



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-18 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #19 from Heather  ---
Hi!

> Do we want white space considered? i.e. is it correct for 'Santabna' to sort
> before 'Santa clarita'?

It is not correct.  The shorthand rule is "nothing before something." 
Therefore, "Santabna" should sort after "Santa Clarita."

This is correct:
> Santa Ana
> Santa Clarita
> Santa Claus
> Santabna
> Santana, Carlos

The example that illustrates this is in 1.1.1 of the Library of Congress Filing
Rules (cited above) (apologies for omitted diacritics, but diacritical
characters are filed as their non-diacritic English language equivalents):

Hand blows
Hand book for Prospect Park
Hand in glove
Handbok for sangere
Handbook for adventure
Hande im Pflug
Hands in the past
Handu

--h2
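
As a rough illustration of "nothing before something": if headings are
compared one word at a time, a heading whose first word ends ("Santa") files
before any heading where that word continues ("Santabna"). The sketch below
uses Python tuple comparison to show this. The helper name word_by_word_key is
hypothetical, and this is not how Koha or Elasticsearch sorts.

```python
def word_by_word_key(heading: str) -> tuple:
    """Hypothetical key: compare headings one whole word at a time."""
    # Splitting on whitespace turns "Santa Clarita" into ("santa", "clarita").
    return tuple(heading.casefold().split())


headings = [
    "Santana, Carlos",
    "Santabna",
    "Santa Claus",
    "Santa Clarita",
    "Santa Ana",
]

for heading in sorted(headings, key=word_by_word_key):
    print(heading)
```

Because "santa" is a prefix of "santabna", every "Santa ..." heading comes
before "Santabna", which matches the list given as correct above.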



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-18 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Nick Clemens  changed:

   What|Removed |Added

 Status|Passed QA   |In Discussion



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-18 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #18 from Nick Clemens  ---
Created attachment 151421
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=151421&action=edit
Screenshot of sorting after

Forgive the silly example, but "alternate: shifted" ignores punctuation and
whitespace.

Do we want white space considered? i.e. is it correct for 'Santabna' to sort
before 'Santa clarita'?

I would expect:
Santa Ana
Santa Clarita
Santa Claus
Santabna
Santana, Carlos
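
To make the difference concrete, here is a small Python approximation of the
two behaviours being discussed: ignoring both whitespace and punctuation
(roughly the effect described for "alternate: shifted") versus ignoring
punctuation only. It is a simplified stand-in, not the ICU collator, and both
helper names are made up for this example.

```python
def letters_only(heading: str) -> str:
    """Ignore whitespace AND punctuation (approximates the 'shifted' result)."""
    return "".join(ch for ch in heading.casefold() if ch.isalnum())


def letters_and_spaces(heading: str) -> str:
    """Ignore punctuation but keep whitespace significant."""
    return "".join(
        ch for ch in heading.casefold() if ch.isalnum() or ch.isspace()
    )


headings = [
    "Santana, Carlos",
    "Santa Claus",
    "Santabna",
    "Santa Ana",
    "Santa Clarita",
]

print("whitespace ignored:", sorted(headings, key=letters_only))
print("whitespace kept:   ", sorted(headings, key=letters_and_spaces))
```

With whitespace ignored, "Santabna" lands between "Santa Ana" and
"Santa Clarita". With whitespace kept, it files after "Santa Claus", as in the
expected list above.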



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-14 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Victor Grousset/tuxayo  changed:

   What|Removed |Added

 Status|Signed Off  |Passed QA
 CC||vic...@tuxayo.net

--- Comment #17 from Victor Grousset/tuxayo  ---
Works, makes sense, QA script happy, code looks good, passing QA :)



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-14 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Victor Grousset/tuxayo  changed:

   What|Removed |Added

 Attachment #150914 is obsolete|0   |1

--- Comment #16 from Victor Grousset/tuxayo  ---
Created attachment 151180
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=151180&action=edit
Bug 26472: Configure Elasticsearch for better alphabetic sorting

This enhancement configures the ICU collation keyword plugin used by
Elasticsearch for sorting to better handle punctuation and whitespace in sort
fields.

Details at:
https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-collation-keyword-field.html

To test:

1. Create three authority records with the following values:

150 $a Science.
150 $a Science $v B.
150 $a Science $v C.

2. Search for your authority records with "Science" in "Heading A-Z" order

The search results will likely be in this order:

1. Science B.
2. Science C.
3. Science.

This is an unexpected order

3. Apply the patch and reindex

sudo koha-elasticsearch --rebuild -r 

4. Search for your authority records again with "Science" in "Heading A-Z"
order

Confirm your search results show in the correct order.

1. Science.
2. Science B.
3. Science C.

Sponsored-by: Education Services Australia SCIS
Signed-off-by: David Nind 
Signed-off-by: Victor Grousset/tuxayo 
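
One way to see where the unexpected order in this test plan can come from: when
punctuation takes part in the comparison, the space in "Science B." sorts
before the period in "Science.". The sketch below shows this with plain Python
string comparison, which is only an analogy for the collation Elasticsearch
performs. The helper without_punctuation is a hypothetical name.

```python
import string

headings = ["Science.", "Science B.", "Science C."]

# Plain comparison: at the eighth character, " " (U+0020) sorts before
# "." (U+002E), so the headings with a subdivision come first.
print(sorted(headings))


def without_punctuation(heading: str) -> str:
    """Hypothetical key: drop ASCII punctuation before comparing."""
    return heading.translate(str.maketrans("", "", string.punctuation))


# With punctuation removed, "Science" is a prefix of the others and sorts first.
print(sorted(headings, key=without_punctuation))
```

The first print gives the "unexpected" order from step 2, the second gives the
order the test plan expects after the patch.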



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-09 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

David Nind  changed:

   What|Removed |Added

 Status|Needs Signoff   |Signed Off

--- Comment #15 from David Nind  ---
(In reply to Aleisha Amohia from comment #14)
> Oops, left the punctuation off my test plan, have amended

Thanks Aleisha.

I can confirm that with the change, authorities with punctuation are now
sorting correctly (Heading A-Z). Tested using Elasticsearch 7.

Have changed the status to Signed Off.

David



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-09 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #14 from Aleisha Amohia  ---
Oops, left the punctuation off my test plan, have amended



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-09 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Aleisha Amohia  changed:

   What|Removed |Added

 Attachment #150899 is obsolete|0   |1

--- Comment #13 from Aleisha Amohia  ---
Created attachment 150914
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=150914&action=edit
Bug 26472: Configure Elasticsearch for better alphabetic sorting

This enhancement configures the ICU collation keyword plugin used by
Elasticsearch for sorting to better handle punctuation and whitespace in sort
fields.

Details at:
https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-collation-keyword-field.html

To test:

1. Create three authority records with the following values:

150 $a Science.
150 $a Science $v B.
150 $a Science $v C.

2. Search for your authority records with "Science" in "Heading A-Z" order

The search results will likely be in this order:

1. Science B.
2. Science C.
3. Science.

This is an unexpected order

3. Apply the patch and reindex

sudo koha-elasticsearch --rebuild -r 

4. Search for your authority records again with "Science" in "Heading A-Z"
order

Confirm your search results show in the correct order.

1. Science.
2. Science B.
3. Science C.

Sponsored-by: Education Services Australia SCIS
Signed-off-by: David Nind 



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-09 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #12 from David Nind  ---
Additional notes:

1. I added Heather's and Janusz's terms - before the patch is applied they sort
incorrectly, after the patch they sort correctly (I can attach some screenshots
if that helps).

2. After applying the patch, I restarted everything (flush_memcached and
restart_all) and I did a full re-index (koha-elasticsearch --rebuild -d -b -a
kohadev).



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-09 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Janusz Kaczmarek  changed:

   What|Removed |Added

 Status|Signed Off  |Needs Signoff

--- Comment #11 from Janusz Kaczmarek  ---
(In reply to Heather from comment #8)
> However, in my own Koha (version 22.05.11.000), the problem persists, with
> retrieved authority records sorting:
> 
> Saint (Fictitious character)
> Saint Charles, Charles
> 
> Rather than correctly as:
> Saint Charles, Charles
> Saint (Fictitious character)
> 
> So the problem exists in my Koha catalog, but I can't reproduce the problem
> in a sandbox, so can't test the patch.  I don't know whether to be happy or
> sad!:)

But did you check "Saint Charles, Charles" vs. "Saint (Fictitious
character)" in the test environment?

I must say, with Aleisha's examples I always got correct results, with or
without the patch (they contain no punctuation).  However, with the two Saints
I get:

Saint (Fictitious character)
Saint Charles, Charles

without the patch, and:

Saint Charles, Charles
Saint (Fictitious character)

with the patch applied.

Aleisha, could you please verify your test data proposal?



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-09 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #10 from David Nind  ---
I've signed this off, but I have a query:

1. If I add the authority records as per step 1 of the test plan, the authority
records sort correctly both before and after the patch is applied.

2. If I add some punctuation to the terms, then before applying they don't sort
correctly, but after the patch they do. For example:
   150 $a Science.  (displays last)
   150 $a Science $v B (displays second)
   150 $a Science $v (C) (displays first)

Reading the bug, I'm assuming that the problem we are trying to solve is for
terms with punctuation.

Testing notes (using koha-testing-docker (KTD)):

1. I tested using Elasticsearch 7 - let me know if it should be tested with
other versions, and with OpenSearch.



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-09 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

David Nind  changed:

   What|Removed |Added

 Attachment #150820 is obsolete|0   |1

--- Comment #9 from David Nind  ---
Created attachment 150899
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=150899&action=edit
Bug 26472: Configure Elasticsearch for better alphabetic sorting

This enhancement configures the ICU collation keyword plugin used by
Elasticsearch for sorting to better handle punctuation and whitespace in sort
fields.

Details at:
https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-collation-keyword-field.html

To test:

1. Create three authority records with the following values:

150 $a Science
150 $a Science $v B
150 $a Science $v C

2. Search for your authority records with "Science" in "Heading A-Z" order

The search results will likely be in this order:

1. Science B
2. Science C
3. Science

This is an unexpected order

3. Apply the patch and reindex

sudo koha-elasticsearch --rebuild -r 

4. Search for your authority records again with "Science" in "Heading A-Z"
order

Confirm your search results show in the correct order.

1. Science
2. Science B
3. Science C

Sponsored-by: Education Services Australia SCIS
Signed-off-by: David Nind 



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-09 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

David Nind  changed:

   What|Removed |Added

 Status|Needs Signoff   |Signed Off



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-09 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #8 from Heather  ---
I added some more headings, with punctuation, since I noticed the problem
originally in headings with punctuation.  I added topical headings:

150 _ 0 Science (Fictitious character)
150 _ 0 Science Science
150 _ 0 Science-Science
150 _ 0 Science, Science

And repeated the above searches, and everything is still sorting correctly
without the patch applied, thusly, in both the staff client and the OPAC of the
sandbox:

Science
Science B
Science Bibliographies
Science Brazil Periodicals.
Science C
Science Congresses
Science fiction.
Science (Fictitious character)
Science Science
Science-Science
Science, Science
SCIENCES DU SOL

However, in my own Koha (version 22.05.11.000), the problem persists, with
retrieved authority records sorting:

Saint (Fictitious character)
Saint Charles, Charles

Rather than correctly as:
Saint Charles, Charles
Saint (Fictitious character)

So the problem exists in my Koha catalog, but I can't reproduce the problem in
a sandbox, so can't test the patch.  I don't know whether to be happy or sad!:)
--h2



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-09 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #7 from Heather  ---
I haven't been able to reproduce the problem.  I created a sandbox without
applying the patch (the sandbox is version 22.12.00.030), and created authority
records with these values:

150 _ 0 Science
150 _ 0 Science B
150 _ 0 Science $v Bibliographies
150 _ 0 Science C
150 _ 0 Science $v Congresses

(I created all of these because I wasn't sure if the plan meant literally
"Science B" in the $a, or create a heading with a subdivision that begins with
"B" after the $a Science, so I did both.)

In the staff client, searching the $a for:
Science

Displays these headings sorting correctly:
Science
Science B
Science Bibliographies
Science Brazil Periodicals.
Science C
Science Congresses

Likewise an authority search in the OPAC for "starts with" sorted by "heading
ascendent" sorts them correctly:

Science
Science B
Science Bibliographies
Science Brazil Periodicals.
Science C
Science Congresses

--h2



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-08 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #6 from Aleisha Amohia  ---
Created attachment 150820
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=150820&action=edit
Bug 26472: Configure Elasticsearch for better alphabetic sorting

This enhancement configures the ICU collation keyword plugin used by
Elasticsearch for sorting to better handle punctuation and whitespace in sort
fields.

Details at:
https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-collation-keyword-field.html

To test:

1. Create three authority records with the following values:

150 $a Science
150 $a Science $v B
150 $a Science $v C

2. Search for your authority records with "Science" in "Heading A-Z" order

The search results will likely be in this order:

1. Science B
2. Science C
3. Science

This is an unexpected order

3. Apply the patch and reindex

sudo koha-elasticsearch --rebuild -r 

4. Search for your authority records again with "Science" in "Heading A-Z"
order

Confirm your search results show in the correct order.

1. Science
2. Science B
3. Science C

Sponsored-by: Education Services Australia SCIS



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-08 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Aleisha Amohia  changed:

   What|Removed |Added

   Patch complexity|--- |Small patch
 Status|ASSIGNED|Needs Signoff



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-08 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Aleisha Amohia  changed:

   What|Removed |Added

  Change sponsored?|--- |Sponsored
   Assignee|koha-b...@lists.koha-community.org |alei...@catalyst.net.nz
 Status|NEW |ASSIGNED



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-08 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

--- Comment #5 from Aleisha Amohia  ---
(In reply to Katrin Fischer from comment #3)
> I wonder if there could be some similarities to bug 33594

I applied the fixes from Bug 33594 but it didn't seem to solve this problem,
thank you anyway!

(In reply to Janusz Kaczmarek from comment #4)
> I'm unable to verify it right now, but maybe somebody could -- if adding
> "alternate: shifted" in admin/searchengine/elasticsearch/field_config.yaml
> here:
> 
> sort:
>   default:
>     type: icu_collation_keyword
>     index: false
>     numeric: true
> +    alternate: shifted
> 
> would solve the issue.

This worked at least for upstream, I will attach a patch.



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-08 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Janusz Kaczmarek  changed:

   What|Removed |Added

 CC||janus...@gmail.com

--- Comment #4 from Janusz Kaczmarek  ---
I'm unable to verify it right now, but maybe somebody could -- if adding
"alternate: shifted" in admin/searchengine/elasticsearch/field_config.yaml
here:

sort:
  default:
    type: icu_collation_keyword
    index: false
    numeric: true
+    alternate: shifted

would solve the issue.
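
For orientation, a sketch of what the proposed options would look like as an
Elasticsearch field mapping, assuming the keys under "sort: default:" are
passed through unchanged as field parameters. It is written as a Python dict
and printed as JSON. "alternate" is an option of the icu_collation_keyword
field type (values "shifted" or "non-ignorable"). The field name
"heading__sort" is a placeholder for this illustration only, and how Koha
turns field_config.yaml into mappings is not shown in this thread.

```python
import json

# Options taken from the proposal above.
sort_field = {
    "type": "icu_collation_keyword",
    "index": False,
    "numeric": True,
    "alternate": "shifted",
}

# "heading__sort" is a placeholder field name, not taken from this thread.
mapping = {"properties": {"heading__sort": sort_field}}

print(json.dumps(mapping, indent=2))
```

Changing a sort field's collation options only affects documents indexed after
the change, which is why the test plans in this bug include a full reindex.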



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-08 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

David Nind  changed:

   What|Removed |Added

 CC||da...@davidnind.com



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-08 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Katrin Fischer  changed:

   What|Removed |Added

   See Also||https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=33594

--- Comment #3 from Katrin Fischer  ---
I wonder if there could be some similarities to bug 33594



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-05-07 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Aleisha Amohia  changed:

   What|Removed |Added

 CC||alei...@catalyst.net.nz

--- Comment #2 from Aleisha Amohia  ---
Bump, we are experiencing the same problem as described by Esther.



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2023-02-24 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Fridolin Somers  changed:

   What|Removed |Added

 CC||fridolin.som...@biblibre.com
   See Also||https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=24720



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2022-06-24 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Esther Melander  changed:

   What|Removed |Added

 CC||esth...@sodaspringsid.com

--- Comment #1 from Esther Melander  ---
We are using Elasticsearch and have noticed a similar problem for authorities
with trailing periods. The trailing periods of an authority should be ignored
in the sort, but such headings are instead showing at the end of the result
list. This problem appears in the advanced editor with the authority look-up
(ctrl-shift-L) search. If you search for "Cooking." with the search parameter
"contains", the result is not returned at the top of the list as expected, but
rather towards the bottom. To complicate matters further, if you use the search
parameter "starts with" or "exact", no results are returned. Do the same search
without the trailing period ("Cooking") and you will get the expected results.

Also, the Library of Congress is moving toward minimally punctuated
authorities, in which some authorities have no closing punctuation. As a
result, existing authorities now follow different punctuation rules. Regardless
of the source, trailing periods are throwing off the sort and, I suspect,
auto-linking as well.

Here are some additional papers on punctuation in authorities.
https://www.oclc.org/bibformats/en/6xx.html shows examples of the 600 subject
tags without punctuation at the end.

These are links to a PowerPoint presentation and a paper on minimally
punctuated records put out by the Library of Congress.
https://www.loc.gov/aba/pcc/sct/documents/GuidelinesMinimallyPunctuatedMARC-SCT-2020-01.pptx
https://www.loc.gov/aba/pcc/documents/PCC-Guidelines-Minimally-Punctuated-MARC-Data.docx

In any event, Elasticsearch appears to need some refinement in how punctuation
is handled to bring it into compliance with current practice.



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2021-09-09 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Phil Ringnalda  changed:

   What|Removed |Added

 CC||p...@chetcolibrary.org



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2020-09-16 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Heather  changed:

   What|Removed |Added

 CC||n...@bywatersolutions.com



[Koha-bugs] [Bug 26472] Elasticsearch - ES - Authority record results not ordered correctly due to punctuation marks

2020-09-16 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26472

Heather  changed:

   What|Removed |Added

 CC||heather_hernan...@nps.gov
