At 09:32 AM 11/6/2012 -0500, Cab Vinton wrote:
We've managed to import a number of MARC records with corrupted
diacritics and my attempts to retrieve these with a report haven't met
w/ success. (Sample records in this list:
http://catalog.splnh.com/cgi-bin/koha/opac-shelves.pl?viewshelf=8.)
My thought is to search for 100, 700, etc. tags containing any
characters outside of the ASCII 32 through 126 range, but my regex
skills aren't up to the task. To wit:
Try a last line such as
WHERE NOT HEX(pname) REGEXP '^([0-7][0-9A-F])*$'
or
WHERE pname REGEXP '[^ -~]'
But I'm afraid that this may find more than what you're looking for (the
black diamonds): it will also match all or most legitimately accented
characters. See for example a z39.50 search for L'évolution de
l'aéronautique by Jauneaud at LoC. My understanding (and I definitely stand
to be corrected) is that some cataloguers used (maybe still do?) two
characters (in themselves both valid in UTF-8) as an [accent][letter]
combination. I have asked our people, and they tell me that when they
import, they edit these out (although I can still find a couple that
sneaked into our db, they do not appear to affect search capability).
Best - Paul
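
To illustrate the distinction drawn in the reply above, here is a minimal Python sketch (not part of the original messages, and not a Koha report) that separates three cases a plain "anything outside ASCII 32 through 126" test would lump together. It assumes the black diamonds are stored as the Unicode replacement character U+FFFD, and that the two-character accent combinations are Unicode combining marks; neither is confirmed by the thread. The function names classify and has_high_byte are made up for this example.

```python
import unicodedata


def has_high_byte(text):
    """True if the UTF-8 encoding contains any byte >= 0x80.

    This mirrors the idea behind the HEX(pname) REGEXP test: every byte
    of a pure-ASCII string has a first hex digit in the range 0-7.
    """
    return any(b >= 0x80 for b in text.encode("utf-8"))


def classify(text):
    """Return a set of labels describing what falls outside ASCII 32-126."""
    labels = set()
    for ch in text:
        if ch == "\ufffd":
            # Assumed form of the "black diamond" (replacement character).
            labels.add("replacement")
        elif unicodedata.combining(ch):
            # A separate accent character combined with a base letter.
            labels.add("combining")
        elif not (32 <= ord(ch) <= 126):
            # Any other character outside the printable ASCII range,
            # including ordinary precomposed accented letters.
            labels.add("non-ascii")
    return labels


if __name__ == "__main__":
    samples = [
        "Jauneaud",            # plain ASCII
        "L'\u00e9volution",    # precomposed e-acute
        "L'e\u0301volution",   # base letter plus combining acute accent
        "L'\ufffdvolution",    # replacement character
    ]
    for s in samples:
        print(repr(s), has_high_byte(s), sorted(classify(s)))
```

Under those assumptions, only the rows labelled "replacement" would be the corrupted ones; the other two labels are valid text that either WHERE clause suggested above would also return.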
SELECT CONCAT('<a href=\"/cgi-bin/koha/catalogue/detail.pl?biblionumber=',biblionumber,'\">',biblionumber,'</a>')
AS bibnumber, pname
FROM
(SELECT biblionumber,
ExtractValue(marcxml,'//datafield[@tag="100"]/subfield[@code>="a"]')
AS pname FROM biblioitems)
AS authors
WHERE pname REGEXP '[\W]'
These attempts also didn't seem to be getting me any closer:
WHERE pname REGEXP '[^a-z]'
WHERE pname LIKE '%[^a-zA-Z0-9]%'
WHERE PATINDEX('%[^a-zA-Z0-9]%',pname) > 1
Any thoughts on how to write this report? Have tried the folks over on
the MarcEdit list, but no solution as yet.
Many thanks,
Cab Vinton, Director
Sanbornton Public Library
Sanbornton, NH
Life is short. Read fast!
_______________________________________________
Koha-devel mailing list
[email protected]
http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
---
Maritime heritage and history, preservation and conservation,
research and education through the written word and the arts.
<http://NavalMarineArchive.com> and <http://UltraMarine.ca>