Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-12-10 Thread David Haslam
Thanks, DM, for the reminder. Even for English, when we include those modern versions that make use of contractions such as "I'm", "You've", "He's", "They're", "We'd", "She'll", "Can't". It's easy for humans to spot the fact that "m", "ve", "s", "re", "d", "ll" & "t" are not whole words in and of themselves

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-12-10 Thread DM Smith
IIRC, the StandardAnalyzer that SWORD uses doesn't allow for that. It has its own fixed handling of punctuation. I've said before, the analyzer is only good for English-like languages. In Him, DM On Dec 10, 2012, at 11:17 AM, David Haslam wrote: > There are some languages

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-12-10 Thread David Haslam
There are some languages in which the apostrophe is used as a letter of the alphabet rather than an item of punctuation, e.g. Somali, in which the apostrophe represents the /Alef/. See http://en.wikipedia.org/wiki/Somali_alphabet Guessing that our Lucene indexing method generally strips out such pu
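
The concern about apostrophes can be illustrated with a minimal sketch. This is not Lucene's StandardAnalyzer and not SWORD code: both function names are hypothetical, and the sample word `go'aan` is only an illustrative string with an apostrophe inside a word. It contrasts a tokenizer that treats the apostrophe as punctuation with one that keeps it as part of the word.

```python
import re

# Hypothetical tokenizers for illustration only; neither reproduces
# what Lucene's StandardAnalyzer actually does.
LETTERS = r"[^\W\d_]+"         # one or more Unicode letters
APOSTROPHES = "'\u2019\u02bc"  # ASCII apostrophe, right single quote, modifier letter apostrophe


def split_on_apostrophe(text):
    """Treat the apostrophe as punctuation, so it separates tokens."""
    return re.findall(LETTERS, text)


def keep_apostrophe(text):
    """Treat an apostrophe between letters as part of the word."""
    pattern = f"{LETTERS}(?:[{APOSTROPHES}]{LETTERS})*"
    return re.findall(pattern, text)


sample = "They're go'aan"
print(split_on_apostrophe(sample))  # fragments such as "re" become tokens
print(keep_apostrophe(sample))      # whole words survive as single tokens
```

Which behaviour is right depends on the language of the module, which is why a single fixed rule for punctuation does not suit every text.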

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-12-10 Thread David Haslam
Hi Troy, Would you be happy for me to place most of your reply as a new section in http://crosswire.org/wiki/DevTools:conf_Files ? Lightly edited for public presentation, of course. David -- View this message in context: http://sword-dev.350566.n4.nabble.com/Re-Search-bug-New-Arabic-Bible-No

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-12-09 Thread Troy A. Griffitts
It was sent in November last year IIRC Peter Original Message Date: Mon, 26 Nov 2012 23:19:20 -0600 From: Greg Hellings To: "SWORD Developers' Collaboration Forum" Subject: Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version On Mon, Nov 26

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-27 Thread niccarter
patch? It was sent in November last year IIRC > > Peter > Original Message >> Date: Mon, 26 Nov 2012 23:19:20 -0600 >> From: Greg Hellings >> To: "SWORD Developers' Collaboration Forum" >> Subject: Re: [sword-devel] Search bug

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-27 Thread Peter von Kaehne
ORD Developers' Collaboration Forum" > Subject: Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD > Version > On Mon, Nov 26, 2012 at 11:15 PM, Nic Carter wrote: > > My understanding is that we are currently locked into a really old > version of the

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread Greg Hellings
On Mon, Nov 26, 2012 at 11:15 PM, Nic Carter wrote: > My understanding is that we are currently locked into a really old version of > the C library False. > & it is no longer being maintained. True >Instead we need to port SWORD to use the current version of the library, Already done. > whi

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread Nic Carter
My understanding is that we are currently locked into a really old version of the C library & it is no longer being maintained. Instead we need to port SWORD to use the current version of the library, which is actively being maintained... I gather some work has been done on this but I'm not sure

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread pola ashraf
To: sword-devel@crosswire.org > Subject: Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD > Version > > On Mon, Nov 26, 2012 at 8:12 AM, DM Smith wrote: > > Correct. JSword uses Lucene's filter for the language, which does more > > normalizati

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread Greg Hellings
On Mon, Nov 26, 2012 at 8:12 AM, DM Smith wrote: > Correct. JSword uses Lucene's filter for the language, which does more > normalization than the StandardAnalyzer which SWORD uses exclusively. The > StandardAnalyzer should only be used for "unaccented" latinate text. Same > with the SimpleAnal

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread DM Smith
Correct. JSword uses Lucene's filter for the language, which does more normalization than the StandardAnalyzer which SWORD uses exclusively. The StandardAnalyzer should only be used for "unaccented" latinate text. Same with the SimpleAnalyzer. (In Lucene, an analyzer is a filter chain which norm

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread Greg Hellings
On Mon, Nov 26, 2012 at 6:22 AM, Peter von Kaehne wrote: > >> From: David Haslam > >> So a similar patch would be necessary in principle to JSword ??? > > No. If And Bible does not have a problem, then JSword does its job correctly. However, BibleTime would require such a patch separately since i

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread Peter von Kaehne
> From: David Haslam > So a similar patch would be necessary in principle to JSword ??? No. If And Bible does not have a problem, then JSword does its job correctly. Peter ___ sword-devel mailing list: sword-devel@crosswire.org http://www.crosswire.o

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread David Haslam
Which (I suppose) would have been a patch to the SWORD API ? So a similar patch would be necessary in principle to JSword ??? David -- View this message in context: http://sword-dev.350566.n4.nabble.com/Re-Search-bug-New-Arabic-Bible-Not-Shaped-SVD-Version-tp4651330p4651336.html Sent from the

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread Peter von Kaehne
Collaboration Forum > Subject: Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD > Version > Sorry for choosing the wrong word > this Wikipedia article talking about this topic > https://en.wikipedia.org/wiki/Arabic_diacritics > > Thanks Chris for your repl

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread pola ashraf
hope someone on this list reports all this discussion to them :) So now we know the problem and the solution. > Date: Mon, 26 Nov 2012 01:05:16 -0800 > From: chris...@crosswire.org > To: sword-devel@crosswire.org > Subject: Re: [sword-devel] Search bug & New Arabic Bible, Not

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread Chris Little
You're talking about vowels, not shaping. Shaping in Arabic changes the shape of the letter according to its context in the word (initial, medial, final, or isolated). I imagine unshaped Arabic would be very difficult to read. Arabic without vowel marks, on the other hand, is standard. I woul

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-25 Thread pola ashraf
Using a comparison tool from ICU, the two strings resulted in different numbers of characters. Words to compare: يَسُوعَ يسوع, which is the name of JESUS Christ in Arabic, but one is shaped and the other isn't. Words converted to HEX format: \u064a \u064e \u0633 \u064f \u0648 \u0639 \u064e \u064a \u0633
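
The comparison described above can be reproduced with a short sketch using only the Python standard library rather than the ICU tool mentioned in the message. The helper name `code_points` and the variable names are assumptions for illustration; the two strings are the two spellings quoted in the message, written as escapes to avoid copy-and-paste problems.

```python
def code_points(text):
    """Return each character of text as a \\uXXXX escape string."""
    return [f"\\u{ord(ch):04x}" for ch in text]


# The two spellings quoted in the message, written as escapes.
with_marks = "\u064a\u064e\u0633\u064f\u0648\u0639\u064e"  # with vowel marks
without_marks = "\u064a\u0633\u0648\u0639"                 # letters only

print(len(with_marks), code_points(with_marks))
print(len(without_marks), code_points(without_marks))
print(with_marks == without_marks)  # the strings differ code point by code point
```

The extra code points in the longer string (U+064E and U+064F) are combining vowel marks, which is why a plain string comparison between the two spellings fails.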

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-25 Thread pola ashraf
I think Arabic shapes add extra Unicode characters; that's why the two identical words I mentioned before don't give the same results. -- Any Arabic search problem is unconnected to shaping. Modules are routinely created and stored in a normalised format, user entries, e.g. for sear

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-25 Thread ref...@gmx.net
Any Arabic search problem is unconnected to shaping. Modules are routinely created and stored in a normalised format; user entries, e.g. for search, are equally normalised Sent from my HTC - Reply message - From: "David Haslam" To: Subject: [sword-devel] Search bug & Ne

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-24 Thread pola ashraf
n for accuracy of some words > Date: Sat, 24 Nov 2012 09:12:25 -0800 > From: dfh...@googlemail.com > To: sword-devel@crosswire.org > Subject: Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD > Version > > Pola wrote, "The permanent solution is to make

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-24 Thread David Haslam
Pola wrote, "The permanent solution is to make search indexes ignore all Arabic shapes" Indeed, this would be true for all similar scripts that use glyph shaping. Not just those in the Arabic/Persian family either. The fundamental problem has been identified and described. We really do need a p

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-24 Thread David Haslam
Pola wrote, "Currently I think in using Mod2Osis to extract the OSIS source text then use Any program that can Remove all Arabic shapes then Package it again using OSIS2Mod" Please understand that a round trip using mod2osis and osis2mod is highly deprecated. Information will always be lost due t

Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-24 Thread David Haslam
Pola, For several very valid reasons we never copy e-Sword source texts to make SWORD modules hosted by CrossWire. /Please do not go down that route/. David -- View this message in context: http://sword-dev.350566.n4.nabble.com/Search-bug-New-Arabic-Bible-Not-Shaped-SVD-Version-tp4651322p465

[sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

2012-11-24 Thread pola ashraf
Hi, Sorry for posting a lot these days :) I just do many searches, readings and experiments on CrossWire programs and modules. I found that I can't search in the SVD Bible, since all words are shaped while I write search words that are not shaped. For example, searching for "يسوع" is not equal to "يَسُوعَ" يَ
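
One way to picture a fix for the mismatch reported here is to fold away combining marks on both the indexed text and the query before comparing. The sketch below is a minimal illustration under that assumption, using Python's standard `unicodedata` module. It is not the SWORD or JSword implementation, and `fold_marks` is a hypothetical name.

```python
import unicodedata


def fold_marks(text):
    """Drop nonspacing combining marks (Unicode category Mn) from text."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return unicodedata.normalize("NFC", stripped)


verse_word = "\u064a\u064e\u0633\u064f\u0648\u0639\u064e"  # spelling with vowel marks
query = "\u064a\u0633\u0648\u0639"                         # spelling typed without marks

print(query == verse_word)                          # raw comparison fails
print(fold_marks(query) == fold_marks(verse_word))  # folded comparison matches
```

Note that this removes every nonspacing mark, including marks that decompose out of letters such as alef with madda, so a real analyzer would probably need to be more selective about which marks it drops.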